github.com/stevedonovan/Penlight.git
author steve donovan <steve.j.donovan@gmail.com> 2012-03-26 14:02:03 +0400
committer steve donovan <steve.j.donovan@gmail.com> 2012-03-26 14:02:03 +0400
commit f7b00e4649d3c43ff1a45804e643afabb95dc2db
tree 80755e5d4e49377510e8472da2d15a10657f6e2e /docs
parent 6896b129f401e772f8b6fc5146a97d751bac2f7f
updating docs and cleanup
Diffstat (limited to 'docs')
-rw-r--r-- docs/api/config.ld 5
-rwxr-xr-x docs/api/gen_modules 3
-rw-r--r-- docs/api/gen_modules.bat 4
-rw-r--r-- docs/api/ldoc.css 298
-rw-r--r-- docs/config.ld 4
-rw-r--r-- docs/doc.css 11
-rw-r--r-- docs/docgen.lua 44
-rw-r--r-- docs/function_index.txt 275
-rw-r--r-- docs/index.lua 83
-rw-r--r-- docs/luadoc.css 287
-rw-r--r-- docs/manual/06-data.md 4
-rw-r--r-- docs/markdown.lua 1388
-rw-r--r-- docs/penlight.jpg bin 24830 -> 0 bytes
-rw-r--r-- docs/penlight.md 2702
-rw-r--r-- docs/readme.md 9
15 files changed, 4 insertions, 5113 deletions
diff --git a/docs/api/config.ld b/docs/api/config.ld
deleted file mode 100644
index 40c618c..0000000
--- a/docs/api/config.ld
+++ /dev/null
@@ -1,5 +0,0 @@
-project = 'Penlight'
-description = '<h1>Penlight Lua Libraries 0.9.5</h1>'
-title = 'Penlight Documentation'
-dir = '.'
-file = '../../lua/pl'
diff --git a/docs/api/gen_modules b/docs/api/gen_modules
deleted file mode 100755
index a7a77f6..0000000
--- a/docs/api/gen_modules
+++ /dev/null
@@ -1,3 +0,0 @@
-PL="../.."
-ldoc --dir . $PL/lua/pl
-
diff --git a/docs/api/gen_modules.bat b/docs/api/gen_modules.bat
deleted file mode 100644
index 6362d48..0000000
--- a/docs/api/gen_modules.bat
+++ /dev/null
@@ -1,4 +0,0 @@
-ldoc --dir . ..\..\lua\pl
-
-
-
diff --git a/docs/api/ldoc.css b/docs/api/ldoc.css
deleted file mode 100644
index 6a86410..0000000
--- a/docs/api/ldoc.css
+++ /dev/null
@@ -1,298 +0,0 @@
-/* BEGIN RESET
-
-Copyright (c) 2010, Yahoo! Inc. All rights reserved.
-Code licensed under the BSD License:
-http://developer.yahoo.com/yui/license.html
-version: 2.8.2r1
-*/
-html {
- color: #000;
- background: #FFF;
-}
-body,div,dl,dt,dd,ul,ol,li,h1,h2,h3,h4,h5,h6,pre,code,form,fieldset,legend,input,button,textarea,p,blockquote,th,td {
- margin: 0;
- padding: 0;
-}
-table {
- border-collapse: collapse;
- border-spacing: 0;
-}
-fieldset,img {
- border: 0;
-}
-address,caption,cite,code,dfn,em,strong,th,var,optgroup {
- font-style: inherit;
- font-weight: inherit;
-}
-del,ins {
- text-decoration: none;
-}
-li {
- list-style: bullet;
- margin-left: 20px;
-}
-caption,th {
- text-align: left;
-}
-h1,h2,h3,h4,h5,h6 {
- font-size: 100%;
- font-weight: bold;
-}
-q:before,q:after {
- content: '';
-}
-abbr,acronym {
- border: 0;
- font-variant: normal;
-}
-sup {
- vertical-align: baseline;
-}
-sub {
- vertical-align: baseline;
-}
-legend {
- color: #000;
-}
-input,button,textarea,select,optgroup,option {
- font-family: inherit;
- font-size: inherit;
- font-style: inherit;
- font-weight: inherit;
-}
-input,button,textarea,select {*font-size:100%;
-}
-/* END RESET */
-
-body {
- margin-left: 1em;
- margin-right: 1em;
- font-family: arial, helvetica, geneva, sans-serif;
- background-color: #ffffff; margin: 0px;
-}
-
-code, tt { font-family: monospace; }
-
-body, p, td, th { font-size: .95em; line-height: 1.2em;}
-
-p, ul { margin: 10px 0 0 10px;}
-
-strong { font-weight: bold;}
-
-em { font-style: italic;}
-
-h1 {
- font-size: 1.5em;
- margin: 0 0 20px 0;
-}
-h2, h3, h4 { margin: 15px 0 10px 0; }
-h2 { font-size: 1.25em; }
-h3 { font-size: 1.15em; }
-h4 { font-size: 1.06em; }
-
-a:link { font-weight: bold; color: #004080; text-decoration: none; }
-a:visited { font-weight: bold; color: #006699; text-decoration: none; }
-a:link:hover { text-decoration: underline; }
-
-hr {
- color:#cccccc;
- background: #00007f;
- height: 1px;
-}
-
-blockquote { margin-left: 3em; }
-
-ul { list-style-type: disc; }
-
-p.name {
- font-family: "Andale Mono", monospace;
- padding-top: 1em;
-}
-
-pre.example {
- background-color: rgb(245, 245, 245);
- border: 1px solid silver;
- padding: 10px;
- margin: 10px 0 10px 0;
- font-family: "Andale Mono", monospace;
- font-size: .85em;
-}
-
-pre {
- background-color: rgb(245, 245, 245);
- border: 1px solid silver;
- padding: 10px;
- margin: 10px 0 10px 0;
- font-family: "Andale Mono", monospace;
-}
-
-
-table.index { border: 1px #00007f; }
-table.index td { text-align: left; vertical-align: top; }
-
-#container {
- margin-left: 1em;
- margin-right: 1em;
- background-color: #f0f0f0;
-}
-
-#product {
- text-align: center;
- border-bottom: 1px solid #cccccc;
- background-color: #ffffff;
-}
-
-#product big {
- font-size: 2em;
-}
-
-#main {
- background-color: #f0f0f0;
- border-left: 2px solid #cccccc;
-}
-
-#navigation {
- float: left;
- width: 18em;
- vertical-align: top;
- background-color: #f0f0f0;
- overflow: visible;
-}
-
-#navigation h2 {
- background-color:#e7e7e7;
- font-size:1.1em;
- color:#000000;
- text-align: left;
- padding:0.2em;
- border-top:1px solid #dddddd;
- border-bottom:1px solid #dddddd;
-}
-
-#navigation ul
-{
- font-size:1em;
- list-style-type: none;
- margin: 1px 1px 10px 1px;
-}
-
-#navigation li {
- text-indent: -1em;
- display: block;
- margin: 3px 0px 0px 22px;
-}
-
-#navigation li li a {
- margin: 0px 3px 0px -1em;
-}
-
-#content {
- margin-left: 18em;
- padding: 1em;
- border-left: 2px solid #cccccc;
- border-right: 2px solid #cccccc;
- background-color: #ffffff;
-}
-
-#about {
- clear: both;
- padding: 5px;
- border-top: 2px solid #cccccc;
- background-color: #ffffff;
-}
-
-@media print {
- body {
- font: 12pt "Times New Roman", "TimeNR", Times, serif;
- }
- a { font-weight: bold; color: #004080; text-decoration: underline; }
-
- #main {
- background-color: #ffffff;
- border-left: 0px;
- }
-
- #container {
- margin-left: 2%;
- margin-right: 2%;
- background-color: #ffffff;
- }
-
- #content {
- padding: 1em;
- background-color: #ffffff;
- }
-
- #navigation {
- display: none;
- }
- pre.example {
- font-family: "Andale Mono", monospace;
- font-size: 10pt;
- page-break-inside: avoid;
- }
-}
-
-table.module_list td {
- border-width: 1px;
- padding: 3px;
- border-style: solid;
- border-color: #cccccc;
-}
-table.module_list td.name { background-color: #f0f0f0; }
-table.module_list td.summary { width: 100%; }
-
-table.file_list {
- border-width: 1px;
- border-style: solid;
- border-color: #cccccc;
- border-collapse: collapse;
-}
-table.file_list td {
- border-width: 1px;
- padding: 3px;
- border-style: solid;
- border-color: #cccccc;
-}
-
-table.file_list td.name { background-color: #f0f0f0; }
-
-table.file_list td.summary { width: 100%; }
-
-table.function_list {
- border-width: 1px;
- border-style: solid;
- border-color: #cccccc;
- border-collapse: collapse;
-}
-table.function_list td {
- border-width: 1px;
- padding: 3px;
- border-style: solid;
- border-color: #cccccc;
-}
-
-table.function_list td.name { background-color: #f0f0f0; min-width: 200px; }
-
-table.function_list td.summary { width: 100%; }
-
-table.table_list {
- border-width: 1px;
- border-style: solid;
- border-color: #cccccc;
- border-collapse: collapse;
-}
-table.table_list td {
- border-width: 1px;
- padding: 3px;
- border-style: solid;
- border-color: #cccccc;
-}
-
-table.table_list td.name { background-color: #f0f0f0; min-width: 300px;}
-
-table.table_list td.summary { width: 100%; }
-
-dl.table dt, dl.function dt {border-top: 1px solid #ccc; padding-top: 1em;}
-dl.table dd, dl.function dd {padding-bottom: 1em; margin: 10px 0 0 20px;}
-dl.table h3, dl.function h3 {font-size: .95em;}
diff --git a/docs/config.ld b/docs/config.ld
index 8206fae..55e3988 100644
--- a/docs/config.ld
+++ b/docs/config.ld
@@ -1,8 +1,8 @@
project = 'Penlight'
-description = 'Penlight Lua Libraries 0.9.8'
+description = 'Penlight Lua Libraries 1.0.0'
full_description = 'The documentation is available @{01-introduction.md|here}.'
title = 'Penlight Documentation'
-dir = 'api_docs'
+dir = 'api'
topics = 'manual'
examples = {'../examples','../tests/test-data.lua'}
package = 'pl'
diff --git a/docs/doc.css b/docs/doc.css
deleted file mode 100644
index fc056c1..0000000
--- a/docs/doc.css
+++ /dev/null
@@ -1,11 +0,0 @@
-body {
- padding-left: 2em;
- margin-left: 60px;
- width: 600px;
- }
-pre {
- background-color: #eeeeff
- }
-a:link { font-weight:bold; color: #004080; text-decoration: none; }
-a:visited { font-weight:bold; color: #006699; text-decoration: none; }
-a:link:hover { text-decoration:underline; }
diff --git a/docs/docgen.lua b/docs/docgen.lua
deleted file mode 100644
index 6916a42..0000000
--- a/docs/docgen.lua
+++ /dev/null
@@ -1,44 +0,0 @@
--- massaging @see references in the markdown source.
--- (for a more elegant way of doing this, see seesubst.lua in the examples
--- directory.)
-local lua = arg[-1]
-local markdown_dir = arg[1] or '.'
-if lua:find ' ' then lua = '"'..lua..'"' end
-
-function markdown (file,tmp)
- local tmp_created
- if tmp then
- local f = io.open (tmp,'w')
- for line in io.lines (file) do
- line = line:gsub('@see [%a%.]+',function(s)
- s = s:gsub('@see ','')
- local m,fun = s:match('(.-)%.(.+)')
- if not m then m = s end
- local res = '[see '..s..'](api/modules/pl.'..m..'.html'
- if fun then return res..'#'..s..')'
- else return res..')'
- end
- end)
- f:write(line,'\n')
- end
- f:close()
- tmp_created = true
- else
- tmp = file
- end
- local cmd = lua..' '..markdown_dir..'/markdown.lua -s doc.css '..tmp
- print(cmd)
- os.execute (cmd)
- if tmp_created then os.remove (tmp) end
-end
-
-markdown ('penlight.md','index.txt')
-markdown ('function_index.txt')
-
-
-
-
-
-
-
-
diff --git a/docs/function_index.txt b/docs/function_index.txt
deleted file mode 100644
index 4e076c2..0000000
--- a/docs/function_index.txt
+++ /dev/null
@@ -1,275 +0,0 @@
-#Penlight Function Index
-
-- [I](api/modules/pl.func.html#I)(pl.func) instantiate a PE unless it has already been done
-- [List.range](api/modules/pl.list.html#List.range)(pl.list) Emulate Python's range(x) function
-- [List.split](api/modules/pl.list.html#List.split)(pl.list) split a string using a delimiter
-- [List:__call](api/modules/pl.list.html#List:__call)(pl.list) can iterate over a list directly
-- [List:__concat](api/modules/pl.list.html#List:__concat)(pl.list) concatenation operator
-- [List:__eq](api/modules/pl.list.html#List:__eq)(pl.list) equality operator ==
-- [List:__tostring](api/modules/pl.list.html#List:__tostring)(pl.list) how our list should be rendered as a string
-- [List:append](api/modules/pl.list.html#List:append)(pl.list) Add an item to the end of the list
-- [List:chop](api/modules/pl.list.html#List:chop)(pl.list) Remove a subrange of elements
-- [List:clear](api/modules/pl.list.html#List:clear)(pl.list) empty the list
-- [List:concat](api/modules/pl.list.html#List:concat)(pl.list) join a list of strings
-- [List:contains](api/modules/pl.list.html#List:contains)(pl.list) does this list contain the value?
-- [List:count](api/modules/pl.list.html#List:count)(pl.list) Return the number of times value appears in the list
-- [List:extend](api/modules/pl.list.html#List:extend)(pl.list) Extend the list by appending all the items in the given list
-- [List:filter](api/modules/pl.list.html#List:filter)(pl.list) create a list of all elements which match a function
-- [List:foreach](api/modules/pl.list.html#List:foreach)(pl.list) call the function for each element of the list
-- [List:index](api/modules/pl.list.html#List:index)(pl.list) Return the index in the list of the first item whose value is given
-- [List:insert](api/modules/pl.list.html#List:insert)(pl.list) Insert an item at a given position
-- [List:iter](api/modules/pl.list.html#List:iter)(pl.list) return an iterator over all values
-- [List:join](api/modules/pl.list.html#List:join)(pl.list) join the elements of a list using a delimiter
-- [List:len](api/modules/pl.list.html#List:len)(pl.list) list:len() is the same as #list
-- [List:map](api/modules/pl.list.html#List:map)(pl.list) apply a function to all elements
-- [List:map2](api/modules/pl.list.html#List:map2)(pl.list) apply a function to elements of two lists
-- [List:mapm](api/modules/pl.list.html#List:mapm)(pl.list) apply a named meethod to all elements
-- [List:new](api/modules/pl.list.html#List:new)(pl.list) Create a new list
-- [List:partition](api/modules/pl.list.html#List:partition)(pl.list) partition a list using a classifier function
-- [List:pop](api/modules/pl.list.html#List:pop)(pl.list) Remove the item at the given position in the list, and return it
-- [List:put](api/modules/pl.list.html#List:put)(pl.list) Insert an item at the begining of the list
-- [List:reduce](api/modules/pl.list.html#List:reduce)(pl.list) 'reduce' a list using a binary function
-- [List:remove](api/modules/pl.list.html#List:remove)(pl.list) Remove an element given its index
-- [List:remove_value](api/modules/pl.list.html#List:remove_value)(pl.list) Remove the first item from the list whose value is given
-- [List:reverse](api/modules/pl.list.html#List:reverse)(pl.list) Reverse the elements of the list, in place
-- [List:slice](api/modules/pl.list.html#List:slice)(pl.list) Emulate list slicing
-- [List:slice_assign](api/modules/pl.list.html#List:slice_assign)(pl.list) general slice assignment s[i1:i2] = seq
-- [List:sort](api/modules/pl.list.html#List:sort)(pl.list) Sort the items of the list, in place
-- [List:splice](api/modules/pl.list.html#List:splice)(pl.list) Insert a sublist into a list
-- [List:transform](api/modules/pl.list.html#List:transform)(pl.list) apply a function to all elements, in-place
-- [Map:get](api/modules/pl.class.html#Map:get)(pl.class) get a value from the map
-- [Map:iter](api/modules/pl.class.html#Map:iter)(pl.class) return an iterator over all key-value pairs
-- [Map:keys](api/modules/pl.class.html#Map:keys)(pl.class) list of keys
-- [Map:len](api/modules/pl.class.html#Map:len)(pl.class) size of map
-- [Map:set](api/modules/pl.class.html#Map:set)(pl.class) put a value into the map
-- [Map:values](api/modules/pl.class.html#Map:values)(pl.class) list of values
-- [MultiMap:set](api/modules/pl.classx.html#MultiMap:set)(pl.classx) add a new value to a key
-- [MultiMap:update](api/modules/pl.classx.html#MultiMap:update)(pl.classx) update a MultiMap using a table
-- [OrderedMap:iter](api/modules/pl.classx.html#OrderedMap:iter)(pl.classx) iterate over key-value pairs in order
-- [OrderedMap:keys](api/modules/pl.classx.html#OrderedMap:keys)(pl.classx) return the keys in order
-- [OrderedMap:set](api/modules/pl.classx.html#OrderedMap:set)(pl.classx) set the key's value
-- [OrderedMap:sort](api/modules/pl.classx.html#OrderedMap:sort)(pl.classx) sort the keys
-- [OrderedMap:update](api/modules/pl.classx.html#OrderedMap:update)(pl.classx) update an OrderedMap using a table
-- [OrderedMap:values](api/modules/pl.classx.html#OrderedMap:values)(pl.classx) return the values in order
-- [Set:difference](api/modules/pl.class.html#Set:difference)(pl.class) new set with elements in the set that are not in the other (also -)
-- [Set:intersection](api/modules/pl.class.html#Set:intersection)(pl.class) intersection of two sets (also *)
-- [Set:isdisjoint](api/modules/pl.class.html#Set:isdisjoint)(pl.class) are the sets disjoint? (no elements in common)
-- [Set:issempty](api/modules/pl.class.html#Set:issempty)(pl.class) is the set empty?
-- [Set:issubset](api/modules/pl.class.html#Set:issubset)(pl.class) is the first set a subset of the second?
-- [Set:map](api/modules/pl.class.html#Set:map)(pl.class) map a function over the values of a set
-- [Set:set](api/modules/pl.class.html#Set:set)(pl.class) add a value to a set
-- [Set:union](api/modules/pl.class.html#Set:union)(pl.class) union of two sets (also +)
-- [Set:unset](api/modules/pl.class.html#Set:unset)(pl.class) remove a value from a set
-- [Set:values](api/modules/pl.class.html#Set:values)(pl.class) get a list of the values in a set
-- [Template:indent_substitute](api/modules/pl.text.html#Template:indent_substitute)(pl.text) substitute values into a template, preserving indentation
-- [Template:safe_substitute](api/modules/pl.text.html#Template:safe_substitute)(pl.text) substitute values into a template
-- [Template:substitute](api/modules/pl.text.html#Template:substitute)(pl.text) substitute values into a template, throwing an error
-- [TypedList:append](api/modules/TypedList.html#TypedList:append)(TypedList) append a value to the list
-- [TypedList:extend](api/modules/TypedList.html#TypedList:extend)(TypedList) extend the list using another list
-- [TypedList:slice](api/modules/TypedList.html#TypedList:slice)(TypedList) return a slice of the list
-- [Var](api/modules/pl.func.html#Var)(pl.func) Functional helpers like composition,binding and placeholder expressions
-- [_makepath](api/modules/pl.dir.html#_makepath)(pl.dir) create a directory path
-- [abspath](api/modules/pl.path.html#abspath)(pl.path) return an absolute path
-- [add](api/modules/pl.operator.html#add)(pl.operator) add two values +
-- [alltokens](api/modules/pl.input.html#alltokens)(pl.input) create an iterator over all tokens
-- [appfile](api/modules/pl.path.html#appfile)(pl.path) return a suitable path for files private to this application
-- [args](api/modules/pl.utils.html#args)(pl.utils) take an arbitrary set of arguments and make into a table
-- [asserteq](api/modules/pl.test.html#asserteq)(pl.test) like assert, except takes two arguments that must be equal
-- [asserteq2](api/modules/pl.test.html#asserteq2)(pl.test) a version of asserteq that takes two pairs of values
-- [at](api/modules/pl.stringx.html#at)(pl.stringx) return the 'character' at the index
-- [basename](api/modules/pl.path.html#basename)(pl.path) return the file part of a path
-- [bind](api/modules/pl.func.html#bind)(pl.func) bind the arguments of a function to given values
-- [bind1](api/modules/pl.utils.html#bind1)(pl.utils) bind the first argument of the function to a value
-- [call](api/modules/pl.operator.html#call)(pl.operator) apply function to some arguments ()
-- [center](api/modules/pl.stringx.html#center)(pl.stringx) center-justify s with width w
-- [choose](api/modules/pl.utils.html#choose)(pl.utils) return either of two values, depending on a condition
-- [class](api/modules/pl.class.html#class)(pl.class) create a new class, derived from a given base class
-- [clear](api/modules/pl.tablex.html#clear)(pl.tablex) clear out the contents of a table
-- [clonetree](api/modules/pl.dir.html#clonetree)(pl.dir) clone a directory tree
-- [column](api/modules/pl.array2d.html#column)(pl.array2d) extract a column from the 2D array
-- [common_prefix](api/modules/pl.path.html#common_prefix)(pl.path) return the largest common prefix path of two paths
-- [compare](api/modules/pl.tablex.html#compare)(pl.tablex) compare two list-like tables using a predicate
-- [compare_no_order](api/modules/pl.tablex.html#compare_no_order)(pl.tablex) compare two tables using a predicate, without regard for element order
-- [compile](api/modules/pl.sip.html#compile)(pl.sip) convert a SIP pattern into a matching function
-- [compose](api/modules/pl.func.html#compose)(pl.func) create a function which chains two functions
-- [concat](api/modules/pl.operator.html#concat)(pl.operator) concatenate two values (either strings or __concat defined)
-- [copy](api/modules/pl.tablex.html#copy)(pl.tablex) make a shallow copy of a table
-- [copyfile](api/modules/pl.dir.html#copyfile)(pl.dir) copy a file
-- [count](api/modules/pl.stringx.html#count)(pl.stringx) count all instances of substring in string
-- [count_map](api/modules/pl.tablex.html#count_map)(pl.tablex) A table where the key/values are the values and value counts of the table
-- [cpp](api/modules/pl.lexer.html#cpp)(pl.lexer) create a C/C++ token iterator from a string
-- [create](api/modules/pl.stringio.html#create)(pl.stringio) create a file object which can be used to construct a string
-- [create_getter](api/modules/pl.input.html#create_getter)(pl.input) create a function which grabs the next value from a source
-- [create_pattern](api/modules/pl.sip.html#create_pattern)(pl.sip) convert a SIP pattern into the equivalent Lua regular expression
-- [curry](api/modules/pl.func.html#curry)(pl.func) bind the first parameter of the function to a value
-- [dedent](api/modules/pl.text.html#dedent)(pl.text) dedent a multiline string by removing any initial indent
-- [deepcompare](api/modules/pl.tablex.html#deepcompare)(pl.tablex) compare two values
-- [deepcopy](api/modules/pl.tablex.html#deepcopy)(pl.tablex) make a deep copy of a table, recursively copying all the keys and fields
-- [difference](api/modules/pl.tablex.html#difference)(pl.tablex) a new table which is the difference of two tables
-- [dirname](api/modules/pl.path.html#dirname)(pl.path) return the directory part of a path
-- [div](api/modules/pl.operator.html#div)(pl.operator) divide first value by second /
-- [endswith](api/modules/pl.stringx.html#endswith)(pl.stringx) does string end with the given substring?
-- [eq](api/modules/pl.operator.html#eq)(pl.operator) returns true if arguments are equal ==
-- [escape](api/modules/pl.utils.html#escape)(pl.utils) escape any 'magic' characters in a string
-- [exists](api/modules/pl.path.html#exists)(pl.path) does a path exist?
-- [expandtabs](api/modules/pl.stringx.html#expandtabs)(pl.stringx) replace all tabs in s with n spaces
-- [expanduser](api/modules/pl.path.html#expanduser)(pl.path) Replace a starting '~' with the user's home directory
-- [expecting](api/modules/pl.lexer.html#expecting)(pl.lexer) get the next token, which must be of the expected type
-- [extension](api/modules/pl.path.html#extension)(pl.path) get the extension part of a path
-- [extract_cols](api/modules/pl.array2d.html#extract_cols)(pl.array2d) extract the specified columns
-- [extract_rows](api/modules/pl.array2d.html#extract_rows)(pl.array2d) extract the specified rows
-- [fields](api/modules/pl.sip.html#fields)(pl.sip) given a pattern and a file object, return an iterator over the results
-- [fields](api/modules/pl.input.html#fields)(pl.input) parse an input source into fields
-- [fill](api/modules/pl.text.html#fill)(pl.text) format a paragraph so that it fits into a line width
-- [filter](api/modules/pl.data.html#filter)(pl.data) Filter input using a query
-- [filter](api/modules/pl.tablex.html#filter)(pl.tablex) filter a table's values using a predicate function
-- [filter](api/modules/pl.dir.html#filter)(pl.dir) return a list of all files in a list of files which match the pattern
-- [find](api/modules/pl.tablex.html#find)(pl.tablex) return the index of a value in a list
-- [find_if](api/modules/pl.tablex.html#find_if)(pl.tablex) return the index (or key) of a value in a table using a comparison function
-- [fnmatch](api/modules/pl.dir.html#fnmatch)(pl.dir) does the filename match the shell pattern?
-- [forall](api/modules/pl.array2d.html#forall)(pl.array2d) perform an operation for all values in a 2D array
-- [foreach](api/modules/pl.tablex.html#foreach)(pl.tablex) apply a function to all elements of a table
-- [foreachi](api/modules/pl.tablex.html#foreachi)(pl.tablex) apply a function to all elements of a list-like table in order
-- [fprintf](api/modules/pl.utils.html#fprintf)(pl.utils) write an arbitrary number of arguments to a file using a format
-- [function_arg](api/modules/pl.utils.html#function_arg)(pl.utils) process a function argument
-- [ge](api/modules/pl.operator.html#ge)(pl.operator) returns true if a is greater or equal to b >=
-- [get_separated_list](api/modules/pl.lexer.html#get_separated_list)(pl.lexer) get a list of parameters separated by a delimiter from a stream
-- [getatime](api/modules/pl.path.html#getatime)(pl.path) Return the time of last access as the number of seconds since the epoch
-- [getctime](api/modules/pl.path.html#getctime)(pl.path) Return the system's ctime
-- [getdirectories](api/modules/pl.dir.html#getdirectories)(pl.dir) return a list of all subdirectories of the directory
-- [getfiles](api/modules/pl.dir.html#getfiles)(pl.dir) return a list of all files in a directory which match the a shell pattern
-- [getline](api/modules/pl.lexer.html#getline)(pl.lexer) get everything in a stream upto a newline
-- [getmtime](api/modules/pl.path.html#getmtime)(pl.path) Return the time of last modification
-- [getrest](api/modules/pl.lexer.html#getrest)(pl.lexer) get the rest of the stream
-- [getsize](api/modules/pl.path.html#getsize)(pl.path) return size of a file
-- [gt](api/modules/pl.operator.html#gt)(pl.operator) returns true if a is greater than b >
-- [icopy](api/modules/pl.tablex.html#icopy)(pl.tablex) copy an array into another one, resizing the destination if necessary
-- [imap](api/modules/pl.tablex.html#imap)(pl.tablex) apply a function to all values of a list
-- [imap2](api/modules/pl.tablex.html#imap2)(pl.tablex) apply a function to values from two arrays
-- [import](api/modules/pl.func.html#import)(pl.func) wrap a table of functions
-- [import](api/modules/pl.utils.html#import)(pl.utils) take a table and 'inject' it into the local namespace
-- [indent](api/modules/pl.text.html#indent)(pl.text) indent a multiline string
-- [index](api/modules/pl.operator.html#index)(pl.operator) get the indexed value from a table []
-- [index_by](api/modules/pl.tablex.html#index_by)(pl.tablex) return a list of all values in a table indexed by another list
-- [index_map](api/modules/pl.tablex.html#index_map)(pl.tablex) create an index map from a list-like table
-- [insert](api/modules/pl.lexer.html#insert)(pl.lexer) insert tokens into a stream
-- [insertvalues](api/modules/pl.tablex.html#insertvalues)(pl.tablex) insert values into a table
-- [instantiate](api/modules/pl.func.html#instantiate)(pl.func) instantiate a PE into an actual function
-- [is_callable](api/modules/pl.utils.html#is_callable)(pl.utils) is the object either a function or a callable object?
-- [is_type](api/modules/pl.utils.html#is_type)(pl.utils) is the object of the specified type?
-- [isabs](api/modules/pl.path.html#isabs)(pl.path) is this an absolute path?
-- [isalnum](api/modules/pl.stringx.html#isalnum)(pl.stringx) does s only contain alphanumeric characters?
-- [isalpha](api/modules/pl.stringx.html#isalpha)(pl.stringx) does s only contain alphabetic characters?
-- [isdigit](api/modules/pl.stringx.html#isdigit)(pl.stringx) does s only contain digits?
-- [isdir](api/modules/pl.path.html#isdir)(pl.path) is this a directory?
-- [isfile](api/modules/pl.path.html#isfile)(pl.path) is this a file?
-- [islower](api/modules/pl.stringx.html#islower)(pl.stringx) does s only contain lower case characters?
-- [isspace](api/modules/pl.stringx.html#isspace)(pl.stringx) does s only contain spaces?
-- [isupper](api/modules/pl.stringx.html#isupper)(pl.stringx) does s only contain upper case characters?
-- [iter](api/modules/pl.array2d.html#iter)(pl.array2d) iterate over all elements in a 2D array, with optional indices
-- [iter](api/modules/pl.permute.html#iter)(pl.permute) an iterator over all permutations of the elements of a list
-- [iter](api/modules/pl.list.html#iter)(pl.list) Create an iterator over a seqence
-- [join](api/modules/pl.stringx.html#join)(pl.stringx) concatenate the strings using this string as a delimiter
-- [join](api/modules/pl.path.html#join)(pl.path) return the path resulting from combining the two paths
-- [keys](api/modules/pl.tablex.html#keys)(pl.tablex) return all the keys of a table in arbitrary order
-- [land](api/modules/pl.operator.html#land)(pl.operator) true if both values evaluate as true (i
-- [le](api/modules/pl.operator.html#le)(pl.operator) returns true if a is less or equal to b <=
-- [len](api/modules/pl.operator.html#len)(pl.operator) returns length of string or table #
-- [lfind](api/modules/pl.stringx.html#lfind)(pl.stringx) find index of first instance of sub in s from the left
-- [lines](api/modules/pl.stringx.html#lines)(pl.stringx) return an interator over all lines in a string
-- [lines](api/modules/pl.config.html#lines)(pl.config) like io
-- [ljust](api/modules/pl.stringx.html#ljust)(pl.stringx) left-justify s with width w
-- [lnot](api/modules/pl.operator.html#lnot)(pl.operator) false if value evaluates as true (i
-- [lor](api/modules/pl.operator.html#lor)(pl.operator) true if either value evaluate as true (i
-- [lstrip](api/modules/pl.stringx.html#lstrip)(pl.stringx) trim any whitespace on the left of s
-- [lt](api/modules/pl.operator.html#lt)(pl.operator) returns true if a is less than b <
-- [lua](api/modules/pl.lexer.html#lua)(pl.lexer) create a Lua token iterator from a string
-- [makeset](api/modules/pl.tablex.html#makeset)(pl.tablex) create a set from a list-like table
-- [map](api/modules/pl.array2d.html#map)(pl.array2d) map a function over a 2D array
-- [map](api/modules/pl.tablex.html#map)(pl.tablex) apply a function to all values of a table
-- [map2](api/modules/pl.tablex.html#map2)(pl.tablex) apply a function to values from two tables
-- [map2](api/modules/pl.array2d.html#map2)(pl.array2d) map a function over two arrays
-- [map_named_method](api/modules/pl.tablex.html#map_named_method)(pl.tablex) apply a named method to values from a table
-- [mapn](api/modules/pl.tablex.html#mapn)(pl.tablex) Apply a function to a number of tables
-- [match](api/modules/pl.sip.html#match)(pl.sip) match a SIP pattern against a string
-- [match_at_start](api/modules/pl.sip.html#match_at_start)(pl.sip) match a SIP pattern against the start of a string
-- [memoize](api/modules/pl.utils.html#memoize)(pl.utils) 'memoize' a function (cache returned value for next call)
-- [merge](api/modules/pl.tablex.html#merge)(pl.tablex) combine two tables, either as union or intersection
-- [mod](api/modules/pl.operator.html#mod)(pl.operator) modulo; remainder of a divided by b %
-- [move](api/modules/pl.tablex.html#move)(pl.tablex) copy an array into another one
-- [movefile](api/modules/pl.dir.html#movefile)(pl.dir) move a file
-- [mul](api/modules/pl.operator.html#mul)(pl.operator) multiply two values *
-- [neq](api/modules/pl.operator.html#neq)(pl.operator) returns true if arguments are not equal ~=
-- [new](api/modules/pl.tablex.html#new)(pl.tablex) create a new array of specified size with initial value
-- [new](api/modules/pl.data.html#new)(pl.data) create a new dataset from a table of rows
-- [nop](api/modules/pl.operator.html#nop)(pl.operator) the null operation
-- [normcase](api/modules/pl.path.html#normcase)(pl.path) Normalize the case of a pathname
-- [numbers](api/modules/pl.input.html#numbers)(pl.input) generate a sequence of numbers from a source
-- [open](api/modules/pl.stringio.html#open)(pl.stringio) create a file object for reading from a given string
-- [package_path](api/modules/pl.path.html#package_path)(pl.path) return the full path where a particular Lua module would be found
-- [pairmap](api/modules/pl.tablex.html#pairmap)(pl.tablex) call the function with the key and value pairs from a table
-- [parse_args](api/modules/pl.utils.html#parse_args)(pl.utils) parse command-line arguments into flags and parameters
-- [partition](api/modules/pl.stringx.html#partition)(pl.stringx) partition the string using first occurance of a delimiter
-- [pattern](api/modules/pl.sip.html#pattern)(pl.sip) register a match which will be used in the read function
-- [pow](api/modules/pl.operator.html#pow)(pl.operator) raise first to the power of second ^
-- [printf](api/modules/pl.utils.html#printf)(pl.utils) print an arbitrary number of arguments using a format
-- [product](api/modules/pl.array2d.html#product)(pl.array2d) cartesian product of two 1d arrays
-- [query](api/modules/pl.data.html#query)(pl.data) create a query iterator from a select string
-- [quit](api/modules/pl.utils.html#quit)(pl.utils) end this program gracefully
-- [range](api/modules/pl.tablex.html#range)(pl.tablex) generate a table of all numbers in a range
-- [range](api/modules/pl.array2d.html#range)(pl.array2d) get a slice of a 2D array using spreadsheet range notation
-- [read](api/modules/pl.sip.html#read)(pl.sip) enter a loop which applies all registered matches to the input file
-- [read](api/modules/pl.data.html#read)(pl.data) read a delimited file in a Lua table
-- [read](api/modules/pl.config.html#read)(pl.config) read a configuration file into a table
-- [readfile](api/modules/pl.utils.html#readfile)(pl.utils) return the contents of a file as a string
-- [readlines](api/modules/pl.utils.html#readlines)(pl.utils) return the contents of a file as a list of lines
-- [reduce](api/modules/pl.tablex.html#reduce)(pl.tablex) 'reduce' a list using a binary function
-- [reduce2](api/modules/pl.array2d.html#reduce2)(pl.array2d) reduce a 2D array into a scalar, using two operations
-- [reduce_cols](api/modules/pl.array2d.html#reduce_cols)(pl.array2d) reduce the columns using a function
-- [reduce_rows](api/modules/pl.array2d.html#reduce_rows)(pl.array2d) reduce the rows using a function
-- [register](api/modules/pl.func.html#register)(pl.func) register a function for use in placeholder expressions
-- [remove_col](api/modules/pl.array2d.html#remove_col)(pl.array2d) remove a column from an array
-- [remove_row](api/modules/pl.array2d.html#remove_row)(pl.array2d) remove a row from an array
-- [removevalues](api/modules/pl.tablex.html#removevalues)(pl.tablex) remove a range of values from a table
-- [replace](api/modules/pl.stringx.html#replace)(pl.stringx) replace up to n instances of old by new in the string s
-- [repr](api/modules/pl.func.html#repr)(pl.func) create a string representation of a placeholder expression
-- [require_here](api/modules/pl.path.html#require_here)(pl.path) add the current script's path to the Lua module path
-- [rfind](api/modules/pl.stringx.html#rfind)(pl.stringx) find index of first instance of sub in s from the right
-- [rfind](api/modules/pl.tablex.html#rfind)(pl.tablex) return the index of a value in a list, searching from the end
-- [rjust](api/modules/pl.stringx.html#rjust)(pl.stringx) right-justify s with width w
-- [rmtree](api/modules/pl.dir.html#rmtree)(pl.dir) remove a whole directory tree
-- [rpartition](api/modules/pl.stringx.html#rpartition)(pl.stringx) partition the string p using last occurrence of a delimiter
-- [rstrip](api/modules/pl.stringx.html#rstrip)(pl.stringx) trim any whitespace on the right of s
-- [scan](api/modules/pl.lexer.html#scan)(pl.lexer) create a plain token iterator from a string
-- [search](api/modules/pl.tablex.html#search)(pl.tablex) find a value in a table by recursive search
-- [set](api/modules/pl.tablex.html#set)(pl.tablex) set an array range to a value
-- [set](api/modules/pl.array2d.html#set)(pl.array2d) set a specified range of an array to a value
-- [size](api/modules/pl.tablex.html#size)(pl.tablex) total number of elements in this table
-- [skipws](api/modules/pl.lexer.html#skipws)(pl.lexer) get the next non-space token from the stream
-- [slice](api/modules/pl.array2d.html#slice)(pl.array2d) get a slice of a 2D array
-- [split](api/modules/pl.stringx.html#split)(pl.stringx) split a string into a list of strings using a pattern
-- [split](api/modules/pl.utils.html#split)(pl.utils) - split a string into a list of strings separated by a delimiter
-- [splitext](api/modules/pl.path.html#splitext)(pl.path) given a path, return the root part and the extension part
-- [splitl](api/modules/pl.utils.html#splitl)(pl.utils) split a string into a list of strings separated by either spaces or commas
-- [splitpath](api/modules/pl.path.html#splitpath)(pl.path) given a path, return the directory part and a file part
-- [splitv](api/modules/pl.stringx.html#splitv)(pl.stringx) split a string using a pattern
-- [splitv](api/modules/pl.utils.html#splitv)(pl.utils) split a string into a number of values
-- [startswith](api/modules/pl.stringx.html#startswith)(pl.stringx) does string start with the substring?
-- [strip](api/modules/pl.stringx.html#strip)(pl.stringx) trim any whitespace on both left and right of s
-- [sub](api/modules/pl.operator.html#sub)(pl.operator) subtract b from a -
-- [swap_cols](api/modules/pl.array2d.html#swap_cols)(pl.array2d) swap two columns of an array
-- [swap_rows](api/modules/pl.array2d.html#swap_rows)(pl.array2d) swap two rows of an array
-- [table](api/modules/pl.operator.html#table)(pl.operator) make a table from the arguments
-- [table](api/modules/pl.permute.html#table)(pl.permute) construct a table containing all the permutations of a list
-- [tail](api/modules/pl.func.html#tail)(pl.func) all elements of a table except the first
-- [timer](api/modules/pl.test.html#timer)(pl.test) Time a function
-- [tmpname](api/modules/pl.path.html#tmpname)(pl.path) Return a suitable full path to a new temporary file name
-- [transform](api/modules/pl.tablex.html#transform)(pl.tablex) apply a function to all values of a table, in-place
-- [unm](api/modules/pl.operator.html#unm)(pl.operator) return the negative of a value -
-- [update](api/modules/pl.tablex.html#update)(pl.tablex) copy a table into another, in-place
-- [values](api/modules/pl.tablex.html#values)(pl.tablex) return all the values of the table in arbitrary order
-- [words](api/modules/pl.input.html#words)(pl.input) generate a sequence of words from a source
-- [wrap](api/modules/pl.text.html#wrap)(pl.text) format a paragraph into lines so that they fit into a line width
-- [write](api/modules/pl.array2d.html#write)(pl.array2d) write a 2D array to a file
-- [writefile](api/modules/pl.utils.html#writefile)(pl.utils) write a string to a file
-- [zip](api/modules/pl.tablex.html#zip)(pl.tablex) return a table where each element is a table of the ith values of an arbitrary
diff --git a/docs/index.lua b/docs/index.lua
deleted file mode 100644
index 6de4713..0000000
--- a/docs/index.lua
+++ /dev/null
@@ -1,83 +0,0 @@
--- indexing a library
-
-require 'pl'
-local match = sip.match_at_start
-
-local module_name
-local funs = List()
-
-function add_function (name,sentence)
- funs:append {name,sentence,module_name}
-end
-
-function process_module (f)
- local state = 'finding_module'
- local res = {}
- module_name = nil
- for line in io.lines(f) do
- if match('--- $',line,res) then
- sentence = res[1]
- local idot = sentence:find '%.'
- if idot then
- sentence = sentence:sub(1,idot-1)
- end
- elseif match('-- @class $v',line,res) then
- if res[1] == 'module' then
- state = 'finding_module_name'
- elseif res[1] == 'function' then
- state = 'finding_function_name'
- end
- elseif match('-- @name $S',line,res) then
- if state == 'finding_module_name' then
- module_name = res[1]
- else
- add_function(res[1],sentence)
- end
- sentence = nil
- elseif match('local function $v',line,res) then
- elseif match('function $S (',line,res) then
- if sentence and module_name then
- add_function(res[1],sentence)
- sentence = nil
- end
- elseif match('module ($q',line,res) then
- if not module_name then
- module_name = res[2] -- res[1] will be the quote used!
- end
- end
- end
-end
-
-local file = arg[1] or utils.quit 'please supply filename or path'
-if path.isfile(file) then
- process_module(arg[1])
-elseif path.isdir(file) then
- local files = dir.getfiles(file,'*.lua')
- for _,f in ipairs(files) do
- if f ~= '.' and f ~= '..' then
---~ print(f)
- process_module(f)
- end
- end
-end
-
-funs:sort(function(t1,t2)
- return t1[1] < t2[1]
-end)
-
-local outf = io.open('function_index.txt','w')
-outf:write('#Penlight Function Index\n\n')
-for i = 1,#funs do
- local t = funs[i]
- local name,mod,descript = t[1],t[3],t[2]
---~ if not mod then mod = '?' end
---~ if not t[3] then t[3] = '??' end
- name = '['..name..']('..'api/modules/'..mod..'.html#'..name..')'
- outf:write('-\t',name..'('..t[3]..')\t'..t[2],'\n')
- --print(t[1]..' ('..t[3]..')',t[2])
- --print(t[1],t[3],t[2])
-end
-outf:close()
-
-
-
diff --git a/docs/luadoc.css b/docs/luadoc.css
deleted file mode 100644
index d22fb10..0000000
--- a/docs/luadoc.css
+++ /dev/null
@@ -1,287 +0,0 @@
-/* BEGIN RESET
-
-Copyright (c) 2010, Yahoo! Inc. All rights reserved.
-Code licensed under the BSD License:
-http://developer.yahoo.com/yui/license.html
-version: 2.8.2r1
-*/
-html {
- color: #000;
- background: #FFF;
-}
-body,div,dl,dt,dd,ul,ol,li,h1,h2,h3,h4,h5,h6,pre,code,form,fieldset,legend,input,button,textarea,p,blockquote,th,td {
- margin: 0;
- padding: 0;
-}
-table {
- border-collapse: collapse;
- border-spacing: 0;
-}
-fieldset,img {
- border: 0;
-}
-address,caption,cite,code,dfn,em,strong,th,var,optgroup {
- font-style: inherit;
- font-weight: inherit;
-}
-del,ins {
- text-decoration: none;
-}
-li {
- list-style: bullet;
- margin-left: 20px;
-}
-caption,th {
- text-align: left;
-}
-h1,h2,h3,h4,h5,h6 {
- font-size: 100%;
- font-weight: bold;
-}
-q:before,q:after {
- content: '';
-}
-abbr,acronym {
- border: 0;
- font-variant: normal;
-}
-sup {
- vertical-align: baseline;
-}
-sub {
- vertical-align: baseline;
-}
-legend {
- color: #000;
-}
-input,button,textarea,select,optgroup,option {
- font-family: inherit;
- font-size: inherit;
- font-style: inherit;
- font-weight: inherit;
-}
-input,button,textarea,select {*font-size:100%;
-}
-/* END RESET */
-
-body {
- margin-left: 1em;
- margin-right: 1em;
- font-family: arial, helvetica, geneva, sans-serif;
- background-color: #ffffff; margin: 0px;
-}
-
-code, tt { font-family: monospace; }
-
-body, p, td, th { font-size: .95em; line-height: 1.2em;}
-
-p, ul { margin: 10px 0 0 10px;}
-
-strong { font-weight: bold;}
-
-h1 {
- font-size: 1.5em;
- margin: 0 0 20px 0;
-}
-h2, h3, h4 { margin: 15px 0 10px 0; }
-h2 { font-size: 1.25em; }
-h3 { font-size: 1.15em; }
-h4 { font-size: 1.06em; }
-
-a:link { font-weight: bold; color: #004080; text-decoration: none; }
-a:visited { font-weight: bold; color: #006699; text-decoration: none; }
-a:link:hover { text-decoration: underline; }
-
-hr {
- color:#cccccc;
- background: #00007f;
- height: 1px;
-}
-
-blockquote { margin-left: 3em; }
-
-ul { list-style-type: disc; }
-
-p.name {
- font-family: "Andale Mono", monospace;
- padding-top: 1em;
-}
-
-pre.example {
- background-color: rgb(245, 245, 245);
- border: 1px solid silver;
- padding: 10px;
- margin: 10px 0 10px 0;
- font-family: "Andale Mono", monospace;
- font-size: .85em;
-}
-
-table.index { border: 1px #00007f; }
-table.index td { text-align: left; vertical-align: top; }
-
-#container {
- margin-left: 1em;
- margin-right: 1em;
- background-color: #f0f0f0;
-}
-
-#product {
- text-align: center;
- border-bottom: 1px solid #cccccc;
- background-color: #ffffff;
-}
-
-#product big {
- font-size: 2em;
-}
-
-#main {
- background-color: #f0f0f0;
- border-left: 2px solid #cccccc;
-}
-
-#navigation {
- float: left;
- width: 18em;
- vertical-align: top;
- background-color: #f0f0f0;
- overflow: visible;
-}
-
-#navigation h2 {
- background-color:#e7e7e7;
- font-size:1.1em;
- color:#000000;
- text-align: left;
- padding:0.2em;
- border-top:1px solid #dddddd;
- border-bottom:1px solid #dddddd;
-}
-
-#navigation ul
-{
- font-size:1em;
- list-style-type: none;
- margin: 1px 1px 10px 1px;
-}
-
-#navigation li {
- text-indent: -1em;
- display: block;
- margin: 3px 0px 0px 22px;
-}
-
-#navigation li li a {
- margin: 0px 3px 0px -1em;
-}
-
-#content {
- margin-left: 18em;
- padding: 1em;
- border-left: 2px solid #cccccc;
- border-right: 2px solid #cccccc;
- background-color: #ffffff;
-}
-
-#about {
- clear: both;
- padding: 5px;
- border-top: 2px solid #cccccc;
- background-color: #ffffff;
-}
-
-@media print {
- body {
- font: 12pt "Times New Roman", "TimeNR", Times, serif;
- }
- a { font-weight: bold; color: #004080; text-decoration: underline; }
-
- #main {
- background-color: #ffffff;
- border-left: 0px;
- }
-
- #container {
- margin-left: 2%;
- margin-right: 2%;
- background-color: #ffffff;
- }
-
- #content {
- padding: 1em;
- background-color: #ffffff;
- }
-
- #navigation {
- display: none;
- }
- pre.example {
- font-family: "Andale Mono", monospace;
- font-size: 10pt;
- page-break-inside: avoid;
- }
-}
-
-table.module_list td {
- border-width: 1px;
- padding: 3px;
- border-style: solid;
- border-color: #cccccc;
-}
-table.module_list td.name { background-color: #f0f0f0; }
-table.module_list td.summary { width: 100%; }
-
-table.file_list {
- border-width: 1px;
- border-style: solid;
- border-color: #cccccc;
- border-collapse: collapse;
-}
-table.file_list td {
- border-width: 1px;
- padding: 3px;
- border-style: solid;
- border-color: #cccccc;
-}
-
-table.file_list td.name { background-color: #f0f0f0; }
-
-table.file_list td.summary { width: 100%; }
-
-table.function_list {
- border-width: 1px;
- border-style: solid;
- border-color: #cccccc;
- border-collapse: collapse;
-}
-table.function_list td {
- border-width: 1px;
- padding: 3px;
- border-style: solid;
- border-color: #cccccc;
-}
-
-table.function_list td.name { background-color: #f0f0f0; }
-
-table.function_list td.summary { width: 100%; }
-
-table.table_list {
- border-width: 1px;
- border-style: solid;
- border-color: #cccccc;
- border-collapse: collapse;
-}
-table.table_list td {
- border-width: 1px;
- padding: 3px;
- border-style: solid;
- border-color: #cccccc;
-}
-
-table.table_list td.name { background-color: #f0f0f0; }
-
-table.table_list td.summary { width: 100%; }
-
-dl.table dt, dl.function dt {border-top: 1px solid #ccc; padding-top: 1em;}
-dl.table dd, dl.function dd {padding-bottom: 1em; margin: 10px 0 0 20px;}
-dl.table h3, dl.function h3 {font-size: .95em;}
\ No newline at end of file
diff --git a/docs/manual/06-data.md b/docs/manual/06-data.md
index 5d24be7..6cb82af 100644
--- a/docs/manual/06-data.md
+++ b/docs/manual/06-data.md
@@ -183,7 +183,7 @@ For this to work, _field names must be Lua identifiers_. So `read` will massage
engineering,1501,maintenance,3
engineering,1433,maintenance,10
-The task is to reduce the dataset to a relevant set of rows and columns, perhaps do some processing on row data, and write the result out to a new CSV file. The `write_row` method uses the delimiter to write the row to a file; `select_row` is like `select`, except it iterates over _rows_, not fields; this is necessary if we are dealing with a lot of columns!
+The task is to reduce the dataset to a relevant set of rows and columns, perhaps do some processing on row data, and write the result out to a new CSV file. The `write_row` method uses the delimiter to write the row to a file; `Data.select_row` is like `Data.select`, except it iterates over _rows_, not fields; this is necessary if we are dealing with a lot of columns!
names = {[1501]='don',[1433]='dilbert'}
keepcols = {'Employee_ID','Hours_Booked'}
@@ -197,7 +197,7 @@ The task is to reduce the dataset to a relevant set of rows and columns, perhaps
t:write_row(outf,row)
end
-`select_row` and `select` can be passed a table specifying the query; a list of field names, a function defining the condition and an optional parameter `sort_by`. It isn't really necessary here, but if we had a more complicated row condition (such as belonging to a specified set) then it is not generally possible to express such a condition as a query string, without resorting to hackery such as global variables.
+`Data.select_row` and `Data.select` can be passed a table specifying the query; a list of field names, a function defining the condition and an optional parameter `sort_by`. It isn't really necessary here, but if we had a more complicated row condition (such as belonging to a specified set) then it is not generally possible to express such a condition as a query string, without resorting to hackery such as global variables.
Data does not have to come from files, nor does it necessarily come from the lab or the accounts department. On Linux, `ps aux` gives you a full listing of all processes running on your machine. It is straightforward to feed the output of this command into `data.read` and perform useful queries on it. Notice that non-identifier characters like '%' get converted into underscores:
diff --git a/docs/markdown.lua b/docs/markdown.lua
deleted file mode 100644
index f23954f..0000000
--- a/docs/markdown.lua
+++ /dev/null
@@ -1,1388 +0,0 @@
-#!/usr/bin/env lua
-
---[[
-# markdown.lua -- version 0.32
-
-<http://www.frykholm.se/files/markdown.lua>
-
-**Author:** Niklas Frykholm, <niklas@frykholm.se>
-**Date:** 31 May 2008
-
-This is an implementation of the popular text markup language Markdown in pure Lua.
-Markdown can convert documents written in a simple and easy to read text format
-to well-formatted HTML. For a more thorough description of Markdown and the Markdown
-syntax, see <http://daringfireball.net/projects/markdown>.
-
-The original Markdown source is written in Perl and makes heavy use of advanced
-regular expression techniques (such as negative look-ahead, etc) which are not available
-in Lua's simple regex engine. Therefore this Lua port has been rewritten from the ground
-up. It is probably not completely bug free. If you notice any bugs, please report them to
-me. A unit test that exposes the error is helpful.
-
-## Usage
-
- require "markdown"
- markdown(source)
-
-``markdown.lua`` exposes a single global function named ``markdown(s)`` which applies the
-Markdown transformation to the specified string.
-
-``markdown.lua`` can also be used directly from the command line:
-
- lua markdown.lua test.md
-
-Creates a file ``test.html`` with the converted content of ``test.md``. Run:
-
- lua markdown.lua -h
-
-For a description of the command-line options.
-
-``markdown.lua`` uses the same license as Lua, the MIT license.
-
-## License
-
-Copyright &copy; 2008 Niklas Frykholm.
-
-Permission is hereby granted, free of charge, to any person obtaining a copy of this
-software and associated documentation files (the "Software"), to deal in the Software
-without restriction, including without limitation the rights to use, copy, modify, merge,
-publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons
-to whom the Software is furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all copies
-or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
-THE SOFTWARE.
-
-## Version history
-
-- **0.32** -- 31 May 2008
- - Fix for links containing brackets
-- **0.31** -- 1 Mar 2008
- - Fix for link definitions followed by spaces
-- **0.30** -- 25 Feb 2008
- - Consistent behavior with Markdown when the same link reference is reused
-- **0.29** -- 24 Feb 2008
- - Fix for <pre> blocks with spaces in them
-- **0.28** -- 18 Feb 2008
- - Fix for link encoding
-- **0.27** -- 14 Feb 2008
- - Fix for link database links with ()
-- **0.26** -- 06 Feb 2008
- - Fix for nested italic and bold markers
-- **0.25** -- 24 Jan 2008
- - Fix for encoding of naked <
-- **0.24** -- 21 Jan 2008
- - Fix for link behavior.
-- **0.23** -- 10 Jan 2008
- - Fix for a regression bug in longer expressions in italic or bold.
-- **0.22** -- 27 Dec 2007
- - Fix for crash when processing blocks with a percent sign in them.
-- **0.21** -- 27 Dec 2007
- - Fix for combined strong and emphasis tags
-- **0.20** -- 13 Oct 2007
- - Fix for < as well in image titles, now matches Dingus behavior
-- **0.19** -- 28 Sep 2007
- - Fix for quotation marks " and ampersands & in link and image titles.
-- **0.18** -- 28 Jul 2007
- - Does not crash on unmatched tags (behaves like standard markdown)
-- **0.17** -- 12 Apr 2007
- - Fix for links with %20 in them.
-- **0.16** -- 12 Apr 2007
- - Do not require arg global to exist.
-- **0.15** -- 28 Aug 2006
- - Better handling of links with underscores in them.
-- **0.14** -- 22 Aug 2006
- - Bug for *`foo()`*
-- **0.13** -- 12 Aug 2006
- - Added -l option for including stylesheet inline in document.
- - Fixed bug in -s flag.
- - Fixed emphasis bug.
-- **0.12** -- 15 May 2006
- - Fixed several bugs to comply with MarkdownTest 1.0 <http://six.pairlist.net/pipermail/markdown-discuss/2004-December/000909.html>
-- **0.11** -- 12 May 2006
- - Fixed bug for escaping `*` and `_` inside code spans.
- - Added license terms.
- - Changed join() to table.concat().
-- **0.10** -- 3 May 2006
- - Initial public release.
-
-// Niklas
-]]
-
-
--- Set up a table for holding local functions to avoid polluting the global namespace
-local M = {}
-local MT = {__index = _G}
-setmetatable(M, MT)
-setfenv(1, M)
-
-----------------------------------------------------------------------
--- Utility functions
-----------------------------------------------------------------------
-
--- Locks table t from changes, writes an error if someone attempts to change the table.
--- This is useful for detecting variables that have "accidentally" been made global. Something
--- I tend to do all too much.
-function lock(t)
- function lock_new_index(t, k, v)
- error("module has been locked -- " .. k .. " must be declared local", 2)
- end
-
- local mt = {__newindex = lock_new_index}
- if getmetatable(t) then mt.__index = getmetatable(t).__index end
- setmetatable(t, mt)
-end
-
--- Returns the result of mapping the values in table t through the function f
-function map(t, f)
- local out = {}
- for k,v in pairs(t) do out[k] = f(v,k) end
- return out
-end
-
--- The identity function, useful as a placeholder.
-function identity(text) return text end
-
--- Functional style if statement. (NOTE: no short circuit evaluation)
-function iff(t, a, b) if t then return a else return b end end
-
--- Splits the text into an array of separate lines.
-function split(text, sep)
- sep = sep or "\n"
- local lines = {}
- local pos = 1
- while true do
- local b,e = text:find(sep, pos)
- if not b then table.insert(lines, text:sub(pos)) break end
- table.insert(lines, text:sub(pos, b-1))
- pos = e + 1
- end
- return lines
-end
-
--- Converts tabs to spaces
-function detab(text)
- local tab_width = 4
- local function rep(match)
- local spaces = -match:len()
- while spaces<1 do spaces = spaces + tab_width end
- return match .. string.rep(" ", spaces)
- end
- text = text:gsub("([^\n]-)\t", rep)
- return text
-end
-
--- Applies string.find for every pattern in the list and returns the first match
-function find_first(s, patterns, index)
- local res = {}
- for _,p in ipairs(patterns) do
- local match = {s:find(p, index)}
- if #match>0 and (#res==0 or match[1] < res[1]) then res = match end
- end
- return unpack(res)
-end
-
--- If a replacement array is specified, the range [start, stop] in the array is replaced
--- with the replacement array and the resulting array is returned. Without a replacement
--- array the section of the array between start and stop is returned.
-function splice(array, start, stop, replacement)
- if replacement then
- local n = stop - start + 1
- while n > 0 do
- table.remove(array, start)
- n = n - 1
- end
- for i,v in ipairs(replacement) do
- table.insert(array, start + i - 1, v)
- end
- return array
- else
- local res = {}
- for i = start,stop do
- table.insert(res, array[i])
- end
- return res
- end
-end
-
--- Outdents the text one step.
-function outdent(text)
- text = "\n" .. text
- text = text:gsub("\n ? ? ?", "\n")
- text = text:sub(2)
- return text
-end
-
--- Indents the text one step.
-function indent(text)
- text = text:gsub("\n", "\n ")
- return text
-end
-
--- Does a simple tokenization of html data. Returns the data as a list of tokens.
--- Each token is a table with a type field (which is either "tag" or "text") and
--- a text field (which contains the original token data).
-function tokenize_html(html)
- local tokens = {}
- local pos = 1
- while true do
- local start = find_first(html, {"<!%-%-", "<[a-z/!$]", "<%?"}, pos)
- if not start then
- table.insert(tokens, {type="text", text=html:sub(pos)})
- break
- end
- if start ~= pos then table.insert(tokens, {type="text", text = html:sub(pos, start-1)}) end
-
- local _, stop
- if html:match("^<!%-%-", start) then
- _,stop = html:find("%-%->", start)
- elseif html:match("^<%?", start) then
- _,stop = html:find("?>", start)
- else
- _,stop = html:find("%b<>", start)
- end
- if not stop then
- -- error("Could not match html tag " .. html:sub(start,start+30))
- table.insert(tokens, {type="text", text=html:sub(start, start)})
- pos = start + 1
- else
- table.insert(tokens, {type="tag", text=html:sub(start, stop)})
- pos = stop + 1
- end
- end
- return tokens
-end
-
-----------------------------------------------------------------------
--- Hash
-----------------------------------------------------------------------
-
--- This is used to "hash" data into alphanumeric strings that are unique
--- in the document. (Note that this is not a cryptographic hash, the hash
--- function is not one-way.) The hash procedure is used to protect parts
--- of the document from further processing.
-
-local HASH = {
- -- Has the hash been inited.
- inited = false,
-
- -- The unique string prepended to all hash values. This is to ensure
- -- that hash values do not accidentally coincide with an actual existing
- -- string in the document.
- identifier = "",
-
- -- Counter that counts up for each new hash instance.
- counter = 0,
-
- -- Hash table.
- table = {}
-}
-
--- Inits hashing. Creates a hash_identifier that doesn't occur anywhere
--- in the text.
-function init_hash(text)
- HASH.inited = true
- HASH.identifier = ""
- HASH.counter = 0
- HASH.table = {}
-
- local s = "HASH"
- local counter = 0
- local id
- while true do
- id = s .. counter
- if not text:find(id, 1, true) then break end
- counter = counter + 1
- end
- HASH.identifier = id
-end
-
--- Returns the hashed value for s.
-function hash(s)
- assert(HASH.inited)
- if not HASH.table[s] then
- HASH.counter = HASH.counter + 1
- local id = HASH.identifier .. HASH.counter .. "X"
- HASH.table[s] = id
- end
- return HASH.table[s]
-end
-
-----------------------------------------------------------------------
--- Protection
-----------------------------------------------------------------------
-
--- The protection module is used to "protect" parts of a document
--- so that they are not modified by subsequent processing steps.
--- Protected parts are saved in a table for later unprotection
-
--- Protection data
-local PD = {
- -- Saved blocks that have been converted
- blocks = {},
-
- -- Block level tags that will be protected
- tags = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "blockquote",
- "pre", "table", "dl", "ol", "ul", "script", "noscript", "form", "fieldset",
- "iframe", "math", "ins", "del"}
-}
-
--- Pattern for matching a block tag that begins and ends in the leftmost
--- column and may contain indented subtags, i.e.
--- <div>
--- A nested block.
--- <div>
--- Nested data.
--- </div>
--- </div>
-function block_pattern(tag)
- return "\n<" .. tag .. ".-\n</" .. tag .. ">[ \t]*\n"
-end
-
--- Pattern for matching a block tag that begins and ends with a newline
-function line_pattern(tag)
- return "\n<" .. tag .. ".-</" .. tag .. ">[ \t]*\n"
-end
-
--- Protects the range of characters from start to stop in the text and
--- returns the protected string.
-function protect_range(text, start, stop)
- local s = text:sub(start, stop)
- local h = hash(s)
- PD.blocks[h] = s
- text = text:sub(1,start) .. h .. text:sub(stop)
- return text
-end
-
--- Protect every part of the text that matches any of the patterns. The first
--- matching pattern is protected first, etc.
-function protect_matches(text, patterns)
- while true do
- local start, stop = find_first(text, patterns)
- if not start then break end
- text = protect_range(text, start, stop)
- end
- return text
-end
-
--- Protects blocklevel tags in the specified text
-function protect(text)
- -- First protect potentially nested block tags
- text = protect_matches(text, map(PD.tags, block_pattern))
- -- Then protect block tags at the line level.
- text = protect_matches(text, map(PD.tags, line_pattern))
- -- Protect <hr> and comment tags
- text = protect_matches(text, {"\n<hr[^>]->[ \t]*\n"})
- text = protect_matches(text, {"\n<!%-%-.-%-%->[ \t]*\n"})
- return text
-end
-
--- Returns true if the string s is a hash resulting from protection
-function is_protected(s)
- return PD.blocks[s]
-end
-
--- Unprotects the specified text by expanding all the nonces
-function unprotect(text)
- for k,v in pairs(PD.blocks) do
- v = v:gsub("%%", "%%%%")
- text = text:gsub(k, v)
- end
- return text
-end
-
-
-----------------------------------------------------------------------
--- Block transform
-----------------------------------------------------------------------
-
--- The block transform functions transform the text on the block level.
--- They work with the text as an array of lines rather than as individual
--- characters.
-
--- Returns true if the line is a ruler of (char) characters.
--- The line must contain at least three char characters and contain only spaces and
--- char characters.
-function is_ruler_of(line, char)
- if not line:match("^[ %" .. char .. "]*$") then return false end
- if not line:match("%" .. char .. ".*%" .. char .. ".*%" .. char) then return false end
- return true
-end
-
--- Identifies the block level formatting present in the line
-function classify(line)
- local info = {line = line, text = line}
-
- if line:match("^ ") then
- info.type = "indented"
- info.outdented = line:sub(5)
- return info
- end
-
- for _,c in ipairs({'*', '-', '_', '='}) do
- if is_ruler_of(line, c) then
- info.type = "ruler"
- info.ruler_char = c
- return info
- end
- end
-
- if line == "" then
- info.type = "blank"
- return info
- end
-
- if line:match("^(#+)[ \t]*(.-)[ \t]*#*[ \t]*$") then
- local m1, m2 = line:match("^(#+)[ \t]*(.-)[ \t]*#*[ \t]*$")
- info.type = "header"
- info.level = m1:len()
- info.text = m2
- return info
- end
-
- if line:match("^ ? ? ?(%d+)%.[ \t]+(.+)") then
- local number, text = line:match("^ ? ? ?(%d+)%.[ \t]+(.+)")
- info.type = "list_item"
- info.list_type = "numeric"
- info.number = 0 + number
- info.text = text
- return info
- end
-
- if line:match("^ ? ? ?([%*%+%-])[ \t]+(.+)") then
- local bullet, text = line:match("^ ? ? ?([%*%+%-])[ \t]+(.+)")
- info.type = "list_item"
- info.list_type = "bullet"
- info.bullet = bullet
- info.text= text
- return info
- end
-
- if line:match("^>[ \t]?(.*)") then
- info.type = "blockquote"
- info.text = line:match("^>[ \t]?(.*)")
- return info
- end
-
- if is_protected(line) then
- info.type = "raw"
- info.html = unprotect(line)
- return info
- end
-
- info.type = "normal"
- return info
-end
-
--- Finds headers consisting of a normal line followed by a ruler and converts them to
--- header entries.
-function headers(array)
- local i = 1
- while i <= #array - 1 do
- if array[i].type == "normal" and array[i+1].type == "ruler" and
- (array[i+1].ruler_char == "-" or array[i+1].ruler_char == "=") then
- local info = {line = array[i].line}
- info.text = info.line
- info.type = "header"
- info.level = iff(array[i+1].ruler_char == "=", 1, 2)
- table.remove(array, i+1)
- array[i] = info
- end
- i = i + 1
- end
- return array
-end
-
--- Find list blocks and convert them to protected data blocks
-function lists(array, sublist)
- local function process_list(arr)
- local function any_blanks(arr)
- for i = 1, #arr do
- if arr[i].type == "blank" then return true end
- end
- return false
- end
-
- local function split_list_items(arr)
- local acc = {arr[1]}
- local res = {}
- for i=2,#arr do
- if arr[i].type == "list_item" then
- table.insert(res, acc)
- acc = {arr[i]}
- else
- table.insert(acc, arr[i])
- end
- end
- table.insert(res, acc)
- return res
- end
-
- local function process_list_item(lines, block)
- while lines[#lines].type == "blank" do
- table.remove(lines)
- end
-
- local itemtext = lines[1].text
- for i=2,#lines do
- itemtext = itemtext .. "\n" .. outdent(lines[i].line)
- end
- if block then
- itemtext = block_transform(itemtext, true)
- if not itemtext:find("<pre>") then itemtext = indent(itemtext) end
- return " <li>" .. itemtext .. "</li>"
- else
- local lines = split(itemtext)
- lines = map(lines, classify)
- lines = lists(lines, true)
- lines = blocks_to_html(lines, true)
- itemtext = table.concat(lines, "\n")
- if not itemtext:find("<pre>") then itemtext = indent(itemtext) end
- return " <li>" .. itemtext .. "</li>"
- end
- end
-
- local block_list = any_blanks(arr)
- local items = split_list_items(arr)
- local out = ""
- for _, item in ipairs(items) do
- out = out .. process_list_item(item, block_list) .. "\n"
- end
- if arr[1].list_type == "numeric" then
- return "<ol>\n" .. out .. "</ol>"
- else
- return "<ul>\n" .. out .. "</ul>"
- end
- end
-
- -- Finds the range of lines composing the first list in the array. A list
- -- starts with (^ list_item) or (blank list_item) and ends with
- -- (blank* $) or (blank normal).
- --
- -- A sublist can start with just (list_item); it does not need a blank...
- local function find_list(array, sublist)
- local function find_list_start(array, sublist)
- if array[1].type == "list_item" then return 1 end
- if sublist then
- for i = 1,#array do
- if array[i].type == "list_item" then return i end
- end
- else
- for i = 1, #array-1 do
- if array[i].type == "blank" and array[i+1].type == "list_item" then
- return i+1
- end
- end
- end
- return nil
- end
- local function find_list_end(array, start)
- local pos = #array
- for i = start, #array-1 do
- if array[i].type == "blank" and array[i+1].type ~= "list_item"
- and array[i+1].type ~= "indented" and array[i+1].type ~= "blank" then
- pos = i-1
- break
- end
- end
- while pos > start and array[pos].type == "blank" do
- pos = pos - 1
- end
- return pos
- end
-
- local start = find_list_start(array, sublist)
- if not start then return nil end
- return start, find_list_end(array, start)
- end
-
- while true do
- local start, stop = find_list(array, sublist)
- if not start then break end
- local text = process_list(splice(array, start, stop))
- local info = {
- line = text,
- type = "raw",
- html = text
- }
- array = splice(array, start, stop, {info})
- end
-
- -- Convert any remaining list items to normal
- for _,line in ipairs(array) do
- if line.type == "list_item" then line.type = "normal" end
- end
-
- return array
-end
-
--- Find and convert blockquote markers.
-function blockquotes(lines)
- local function find_blockquote(lines)
- local start
- for i,line in ipairs(lines) do
- if line.type == "blockquote" then
- start = i
- break
- end
- end
- if not start then return nil end
-
- local stop = #lines
- for i = start+1, #lines do
- if lines[i].type == "blank" or lines[i].type == "blockquote" then
- elseif lines[i].type == "normal" then
- if lines[i-1].type == "blank" then stop = i-1 break end
- else
- stop = i-1 break
- end
- end
- while lines[stop].type == "blank" do stop = stop - 1 end
- return start, stop
- end
-
- local function process_blockquote(lines)
- local raw = lines[1].text
- for i = 2,#lines do
- raw = raw .. "\n" .. lines[i].text
- end
- local bt = block_transform(raw)
- if not bt:find("<pre>") then bt = indent(bt) end
- return "<blockquote>\n " .. bt ..
- "\n</blockquote>"
- end
-
- while true do
- local start, stop = find_blockquote(lines)
- if not start then break end
- local text = process_blockquote(splice(lines, start, stop))
- local info = {
- line = text,
- type = "raw",
- html = text
- }
- lines = splice(lines, start, stop, {info})
- end
- return lines
-end
-
--- Find and convert codeblocks.
-function codeblocks(lines)
- local function find_codeblock(lines)
- local start
- for i,line in ipairs(lines) do
- if line.type == "indented" then start = i break end
- end
- if not start then return nil end
-
- local stop = #lines
- for i = start+1, #lines do
- if lines[i].type ~= "indented" and lines[i].type ~= "blank" then
- stop = i-1
- break
- end
- end
- while lines[stop].type == "blank" do stop = stop - 1 end
- return start, stop
- end
-
- local function process_codeblock(lines)
- local raw = detab(encode_code(outdent(lines[1].line)))
- for i = 2,#lines do
- raw = raw .. "\n" .. detab(encode_code(outdent(lines[i].line)))
- end
- return "<pre><code>" .. raw .. "\n</code></pre>"
- end
-
- while true do
- local start, stop = find_codeblock(lines)
- if not start then break end
- local text = process_codeblock(splice(lines, start, stop))
- local info = {
- line = text,
- type = "raw",
- html = text
- }
- lines = splice(lines, start, stop, {info})
- end
- return lines
-end
-
-local idcount = 1
-local list_of_headers = {}
-local first_header
-
--- Convert lines to html code
-function blocks_to_html(lines, no_paragraphs)
- local out = {}
- local i = 1
- while i <= #lines do
- local line = lines[i]
- if line.type == "ruler" then
- table.insert(out, "<hr/>")
- elseif line.type == "raw" then
- table.insert(out, line.html)
- elseif line.type == "normal" then
- local s = line.line
-
- while i+1 <= #lines and lines[i+1].type == "normal" do
- i = i + 1
- s = s .. "\n" .. lines[i].line
- end
-
- if no_paragraphs then
- table.insert(out, span_transform(s))
- else
- table.insert(out, "<p>" .. span_transform(s) .. "</p>")
- end
- elseif line.type == "header" then
- local txt = span_transform(line.text)
- local id = "T" .. idcount
- local s = "<h" .. line.level .. ' id="'.. id .. '">' .. txt .. "</h" .. line.level .. ">"
- if not first_header then
- first_header = {line=s,text=txt}
- else
- table.insert(out, s)
- table.insert(list_of_headers, {level=line.level,text=txt,id=id})
- end
- idcount = idcount + 1
- else
- table.insert(out, line.line)
- end
- i = i + 1
- end
- return out
-end
-
--- Perform all the block level transforms
-function block_transform(text, sublist)
- local lines = split(text)
- lines = map(lines, classify)
- lines = headers(lines)
- lines = lists(lines, sublist)
- lines = codeblocks(lines)
- lines = blockquotes(lines)
- lines = blocks_to_html(lines)
- local text = table.concat(lines, "\n")
- return text
-end
-
--- Debug function for printing a line array to see the result
--- of partial transforms.
-function print_lines(lines)
- for i, line in ipairs(lines) do
- print(i, line.type, line.text or line.line)
- end
-end
-
-----------------------------------------------------------------------
--- Span transform
-----------------------------------------------------------------------
-
--- Functions for transforming the text at the span level.
-
--- These characters may need to be escaped because they have a special
--- meaning in markdown.
-escape_chars = "'\\`*_{}[]()>#+-.!'"
-escape_table = {}
-
-function init_escape_table()
- escape_table = {}
- for i = 1,#escape_chars do
- local c = escape_chars:sub(i,i)
- escape_table[c] = hash(c)
- end
-end
-
--- Adds a new escape to the escape table.
-function add_escape(text)
- if not escape_table[text] then
- escape_table[text] = hash(text)
- end
- return escape_table[text]
-end
-
--- Escape characters that should not be disturbed by markdown.
-function escape_special_chars(text)
- local tokens = tokenize_html(text)
-
- local out = ""
- for _, token in ipairs(tokens) do
- local t = token.text
- if token.type == "tag" then
- -- In tags, encode * and _ so they don't conflict with their use in markdown.
- t = t:gsub("%*", escape_table["*"])
- t = t:gsub("%_", escape_table["_"])
- else
- t = encode_backslash_escapes(t)
- end
- out = out .. t
- end
- return out
-end
-
--- Encode backslash-escaped characters in the markdown source.
-function encode_backslash_escapes(t)
- for i=1,escape_chars:len() do
- local c = escape_chars:sub(i,i)
- t = t:gsub("\\%" .. c, escape_table[c])
- end
- return t
-end
-
--- Unescape characters that have been encoded.
-function unescape_special_chars(t)
- local tin = t
- for k,v in pairs(escape_table) do
- k = k:gsub("%%", "%%%%")
- t = t:gsub(v,k)
- end
- if t ~= tin then t = unescape_special_chars(t) end
- return t
-end
-
--- Encode/escape certain characters inside Markdown code runs.
--- The point is that in code, these characters are literals,
--- and lose their special Markdown meanings.
-function encode_code(s)
- s = s:gsub("%&", "&amp;")
- s = s:gsub("<", "&lt;")
- s = s:gsub(">", "&gt;")
- for k,v in pairs(escape_table) do
- s = s:gsub("%"..k, v)
- end
- return s
-end
-
--- Handle backtick blocks.
-function code_spans(s)
- s = s:gsub("\\\\", escape_table["\\"])
- s = s:gsub("\\`", escape_table["`"])
-
- local pos = 1
- while true do
- local start, stop = s:find("`+", pos)
- if not start then return s end
- local count = stop - start + 1
- -- Find a matching number of backticks
- local estart, estop = s:find(string.rep("`", count), stop+1)
- local brstart = s:find("\n", stop+1)
- if estart and (not brstart or estart < brstart) then
- local code = s:sub(stop+1, estart-1)
- code = code:gsub("^[ \t]+", "")
- code = code:gsub("[ \t]+$", "")
- code = code:gsub(escape_table["\\"], escape_table["\\"] .. escape_table["\\"])
- code = code:gsub(escape_table["`"], escape_table["\\"] .. escape_table["`"])
- code = "<code>" .. encode_code(code) .. "</code>"
- code = add_escape(code)
- s = s:sub(1, start-1) .. code .. s:sub(estop+1)
- pos = start + code:len()
- else
- pos = stop + 1
- end
- end
- return s
-end
-
--- Encode alt text... encodes &, " and <.
-function encode_alt(s)
- if not s then return s end
- s = s:gsub('&', '&amp;')
- s = s:gsub('"', '&quot;')
- s = s:gsub('<', '&lt;')
- return s
-end
-
--- Handle image references
-function images(text)
- local function reference_link(alt, id)
- alt = encode_alt(alt:match("%b[]"):sub(2,-2))
- id = id:match("%[(.*)%]"):lower()
- if id == "" then id = text:lower() end
- link_database[id] = link_database[id] or {}
- if not link_database[id].url then return nil end
- local url = link_database[id].url or id
- url = encode_alt(url)
- local title = encode_alt(link_database[id].title)
- if title then title = " title=\"" .. title .. "\"" else title = "" end
- return add_escape ('<img src="' .. url .. '" alt="' .. alt .. '"' .. title .. "/>")
- end
-
- local function inline_link(alt, link)
- alt = encode_alt(alt:match("%b[]"):sub(2,-2))
- local url, title = link:match("%(<?(.-)>?[ \t]*['\"](.+)['\"]")
- url = url or link:match("%(<?(.-)>?%)")
- url = encode_alt(url)
- title = encode_alt(title)
- if title then
- return add_escape('<img src="' .. url .. '" alt="' .. alt .. '" title="' .. title .. '"/>')
- else
- return add_escape('<img src="' .. url .. '" alt="' .. alt .. '"/>')
- end
- end
-
- text = text:gsub("!(%b[])[ \t]*\n?[ \t]*(%b[])", reference_link)
- text = text:gsub("!(%b[])(%b())", inline_link)
- return text
-end
-
--- Handle anchor references
-function anchors(text)
- local function reference_link(text, id)
- text = text:match("%b[]"):sub(2,-2)
- id = id:match("%b[]"):sub(2,-2):lower()
- if id == "" then id = text:lower() end
- link_database[id] = link_database[id] or {}
- if not link_database[id].url then return nil end
- local url = link_database[id].url or id
- url = encode_alt(url)
- local title = encode_alt(link_database[id].title)
- if title then title = " title=\"" .. title .. "\"" else title = "" end
- return add_escape("<a href=\"" .. url .. "\"" .. title .. ">") .. text .. add_escape("</a>")
- end
-
- local function inline_link(text, link)
- text = text:match("%b[]"):sub(2,-2)
- local url, title = link:match("%(<?(.-)>?[ \t]*['\"](.+)['\"]")
- title = encode_alt(title)
- url = url or link:match("%(<?(.-)>?%)") or ""
- url = encode_alt(url)
- if title then
- return add_escape("<a href=\"" .. url .. "\" title=\"" .. title .. "\">") .. text .. "</a>"
- else
- return add_escape("<a href=\"" .. url .. "\">") .. text .. add_escape("</a>")
- end
- end
-
- text = text:gsub("(%b[])[ \t]*\n?[ \t]*(%b[])", reference_link)
- text = text:gsub("(%b[])(%b())", inline_link)
- return text
-end
-
--- Handle auto links, i.e. <http://www.google.com/>.
-function auto_links(text)
- local function link(s)
- return add_escape("<a href=\"" .. s .. "\">") .. s .. "</a>"
- end
- -- Encode chars as a mix of dec and hex entities to (perhaps) fool
- -- spambots.
- local function encode_email_address(s)
- -- Use a deterministic encoding to make unit testing possible.
- -- Code 45% hex, 45% dec, 10% plain.
- local hex = {code = function(c) return "&#x" .. string.format("%x", c:byte()) .. ";" end, count = 1, rate = 0.45}
- local dec = {code = function(c) return "&#" .. c:byte() .. ";" end, count = 0, rate = 0.45}
- local plain = {code = function(c) return c end, count = 0, rate = 0.1}
- local codes = {hex, dec, plain}
- local function swap(t,k1,k2) local temp = t[k2] t[k2] = t[k1] t[k1] = temp end
-
- local out = ""
- for i = 1,s:len() do
- for _,code in ipairs(codes) do code.count = code.count + code.rate end
- if codes[1].count < codes[2].count then swap(codes,1,2) end
- if codes[2].count < codes[3].count then swap(codes,2,3) end
- if codes[1].count < codes[2].count then swap(codes,1,2) end
-
- local code = codes[1]
- local c = s:sub(i,i)
- -- Force encoding of "@" to make email address more invisible.
- if c == "@" and code == plain then code = codes[2] end
- out = out .. code.code(c)
- code.count = code.count - 1
- end
- return out
- end
- local function mail(s)
- s = unescape_special_chars(s)
- local address = encode_email_address("mailto:" .. s)
- local text = encode_email_address(s)
- return add_escape("<a href=\"" .. address .. "\">") .. text .. "</a>"
- end
- -- links
- text = text:gsub("<(https?:[^'\">%s]+)>", link)
- text = text:gsub("<(ftp:[^'\">%s]+)>", link)
-
- -- mail
- text = text:gsub("<mailto:([^'\">%s]+)>", mail)
- text = text:gsub("<([-.%w]+%@[-.%w]+)>", mail)
- return text
-end
-
--- Encode free standing amps (&) and angles (<)... note that this does not
--- encode free >.
-function amps_and_angles(s)
- -- encode amps not part of &..; expression
- local pos = 1
- while true do
- local amp = s:find("&", pos)
- if not amp then break end
- local semi = s:find(";", amp+1)
- local stop = s:find("[ \t\n&]", amp+1)
- if not semi or (stop and stop < semi) or (semi - amp) > 15 then
- s = s:sub(1,amp-1) .. "&amp;" .. s:sub(amp+1)
- pos = amp+1
- else
- pos = amp+1
- end
- end
-
- -- encode naked <'s
- s = s:gsub("<([^a-zA-Z/?$!])", "&lt;%1")
- s = s:gsub("<$", "&lt;")
-
- -- what about >, nothing done in the original markdown source to handle them
- return s
-end
-
--- Handles emphasis markers (* and _) in the text.
-function emphasis(text)
- for _, s in ipairs {"%*%*", "%_%_"} do
- text = text:gsub(s .. "([^%s][%*%_]?)" .. s, "<strong>%1</strong>")
- text = text:gsub(s .. "([^%s][^<>]-[^%s][%*%_]?)" .. s, "<strong>%1</strong>")
- end
- for _, s in ipairs {"%*", "%_"} do
- text = text:gsub(s .. "([^%s_])" .. s, "<em>%1</em>")
- text = text:gsub(s .. "(<strong>[^%s_]</strong>)" .. s, "<em>%1</em>")
- text = text:gsub(s .. "([^%s_][^<>_]-[^%s_])" .. s, "<em>%1</em>")
- text = text:gsub(s .. "([^<>_]-<strong>[^<>_]-</strong>[^<>_]-)" .. s, "<em>%1</em>")
- end
- return text
-end
-
--- Handles line break markers in the text.
-function line_breaks(text)
- return text:gsub(" +\n", " <br/>\n")
-end
-
--- Perform all span level transforms.
-function span_transform(text)
- text = code_spans(text)
- text = escape_special_chars(text)
- text = images(text)
- text = anchors(text)
- text = auto_links(text)
- text = amps_and_angles(text)
- text = emphasis(text)
- text = line_breaks(text)
- return text
-end
-
-----------------------------------------------------------------------
--- Markdown
-----------------------------------------------------------------------
-
--- Cleanup the text by normalizing some possible variations to make further
--- processing easier.
-function cleanup(text)
- -- Standardize line endings
- text = text:gsub("\r\n", "\n") -- DOS to UNIX
- text = text:gsub("\r", "\n") -- Mac to UNIX
-
- -- Convert all tabs to spaces
- text = detab(text)
-
- -- Strip lines with only spaces and tabs
- while true do
- local subs
- text, subs = text:gsub("\n[ \t]+\n", "\n\n")
- if subs == 0 then break end
- end
-
- return "\n" .. text .. "\n"
-end
-
--- Strips link definitions from the text and stores the data in a lookup table.
-function strip_link_definitions(text)
- local linkdb = {}
-
- local function link_def(id, url, title)
- id = id:match("%[(.+)%]"):lower()
- linkdb[id] = linkdb[id] or {}
- linkdb[id].url = url or linkdb[id].url
- linkdb[id].title = title or linkdb[id].title
- return ""
- end
-
- local def_no_title = "\n ? ? ?(%b[]):[ \t]*\n?[ \t]*<?([^%s>]+)>?[ \t]*"
- local def_title1 = def_no_title .. "[ \t]+\n?[ \t]*[\"'(]([^\n]+)[\"')][ \t]*"
- local def_title2 = def_no_title .. "[ \t]*\n[ \t]*[\"'(]([^\n]+)[\"')][ \t]*"
- local def_title3 = def_no_title .. "[ \t]*\n?[ \t]+[\"'(]([^\n]+)[\"')][ \t]*"
-
- text = text:gsub(def_title1, link_def)
- text = text:gsub(def_title2, link_def)
- text = text:gsub(def_title3, link_def)
- text = text:gsub(def_no_title, link_def)
- return text, linkdb
-end
-
-link_database = {}
-
--- Main markdown processing function
-function markdown(text)
- init_hash(text)
- init_escape_table()
-
- text = cleanup(text)
- text = protect(text)
- text, link_database = strip_link_definitions(text)
- text = block_transform(text)
- text = unescape_special_chars(text)
- return text
-end
-
-----------------------------------------------------------------------
--- End of module
-----------------------------------------------------------------------
-
-setfenv(1, _G)
-M.lock(M)
-
--- Expose markdown function to the world
-markdown = M.markdown
-
--- Class for parsing command-line options
-local OptionParser = {}
-OptionParser.__index = OptionParser
-
--- Creates a new option parser
-function OptionParser:new()
- local o = {short = {}, long = {}}
- setmetatable(o, self)
- return o
-end
-
--- Calls f() whenever a flag with specified short and long name is encountered
-function OptionParser:flag(short, long, f)
- local info = {type = "flag", f = f}
- if short then self.short[short] = info end
- if long then self.long[long] = info end
-end
-
--- Calls f(param) whenever a parameter flag with specified short and long name is encountered
-function OptionParser:param(short, long, f)
- local info = {type = "param", f = f}
- if short then self.short[short] = info end
- if long then self.long[long] = info end
-end
-
--- Calls f(v) for each non-flag argument
-function OptionParser:arg(f)
- self.arg = f
-end
-
--- Runs the option parser for the specified set of arguments. Returns true if all arguments
--- were successfully parsed and false otherwise.
-function OptionParser:run(args)
- local pos = 1
- while pos <= #args do
- local arg = args[pos]
- if arg == "--" then
- for i=pos+1,#args do
- if self.arg then self.arg(args[i]) end
- end
- return true
- end
- if arg:match("^%-%-") then
- local info = self.long[arg:sub(3)]
- if not info then print("Unknown flag: " .. arg) return false end
- if info.type == "flag" then
- info.f()
- pos = pos + 1
- else
- local param = args[pos+1]
- if not param then print("No parameter for flag: " .. arg) return false end
- info.f(param)
- pos = pos+2
- end
- elseif arg:match("^%-") then
- for i=2,arg:len() do
- local c = arg:sub(i,i)
- local info = self.short[c]
- if not info then print("Unknown flag: -" .. c) return false end
- if info.type == "flag" then
- info.f()
- else
- if i == arg:len() then
- local param = args[pos+1]
- if not param then print("No parameter for flag: -" .. c) return false end
- info.f(param)
- pos = pos + 1
- else
- local param = arg:sub(i+1)
- info.f(param)
- end
- break
- end
- end
- pos = pos + 1
- else
- if self.arg then self.arg(arg) end
- pos = pos + 1
- end
- end
- return true
-end
-
--- Handles the case when markdown is run from the command line
-local function run_command_line(arg)
- -- Generate output for input s given options
- local function run(s, options)
- s = markdown(s)
- if not options.wrap_header then return s end
- local header = ""
- if options.header then
- local f = io.open(options.header) or error("Could not open file: " .. options.header)
- header = f:read("*a")
- f:close()
- else
- header = [[
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html>
-<head>
- <meta http-equiv="content-type" content="text/html; charset=CHARSET" />
- <title>TITLE</title>
- <link rel="stylesheet" type="text/css" href="STYLESHEET" />
-</head>
-<body>
-]]
- local title = options.title or (first_header and first_header.text) or "Untitled"
- header = header:gsub("TITLE", title)
- if options.inline_style then
- local style = ""
- local f = io.open(options.stylesheet)
- if f then
- style = f:read("*a") f:close()
- else
- error("Could not include style sheet " .. options.stylesheet .. ": File not found")
- end
- header = header:gsub('<link rel="stylesheet" type="text/css" href="STYLESHEET" />',
- "<style type=\"text/css\"><!--\n" .. style .. "\n--></style>")
- else
- header = header:gsub("STYLESHEET", options.stylesheet)
- end
- header = header:gsub("CHARSET", options.charset)
- end
- local footer = "</body></html>"
- if options.footer then
- local f = io.open(options.footer) or error("Could not open file: " .. options.footer)
- footer = f:read("*a")
- f:close()
- end
- if first_header then
- header = header .. first_header.line ..'\n'
- -- Build TOC if required!
- local txt = ''
- local indent = 0
- local last_level
- for i,h in ipairs(list_of_headers) do
- if i > 1 then
- local diff = h.level - last_level
- if diff > 0 then indent = indent + 1
- elseif diff < 0 then indent = indent - 1
- end
- end
- txt = txt..string.rep('\t',indent)..'* ['..h.text..'](#'..h.id..')\n'
- last_level = h.level
- end
- header = header .. markdown(txt)
- end
- return header .. s .. footer
- end
-
- -- Generate output path name from input path name given options.
- local function outpath(path, options)
- if options.append then return path .. ".html" end
- local m = path:match("^(.+%.html)[^/\\]+$") if m then return m end
- m = path:match("^(.+%.)[^/\\]*$") if m and path ~= m .. "html" then return m .. "html" end
- return path .. ".html"
- end
-
- -- Default commandline options
- local options = {
- wrap_header = true,
- header = nil,
- footer = nil,
- charset = "utf-8",
- title = nil,
- stylesheet = "default.css",
- inline_style = false
- }
- local help = [[
-Usage: markdown.lua [OPTION] [FILE]
-Runs the markdown text markup to HTML converter on each file specified on the
-command line. If no files are specified, runs on standard input.
-
-No header:
- -n, --no-wrap Don't wrap the output in <html>... tags.
-Custom header:
- -e, --header FILE Use content of FILE for header.
- -f, --footer FILE Use content of FILE for footer.
-Generated header:
- -c, --charset SET Specifies charset (default utf-8).
- -i, --title TITLE Specifies title (default from first <h1> tag).
- -s, --style STYLE Specifies style sheet file (default default.css).
- -l, --inline-style Include the style sheet file inline in the header.
-Generated files:
- -a, --append Append .html extension (instead of replacing).
-Other options:
- -h, --help Print this help text.
- -t, --test Run the unit tests.
-]]
-
- local run_stdin = true
- local op = OptionParser:new()
- op:flag("n", "no-wrap", function () options.wrap_header = false end)
- op:param("e", "header", function (x) options.header = x end)
- op:param("f", "footer", function (x) options.footer = x end)
- op:param("c", "charset", function (x) options.charset = x end)
- op:param("i", "title", function(x) options.title = x end)
- op:param("s", "style", function(x) options.stylesheet = x end)
- op:flag("l", "inline-style", function(x) options.inline_style = true end)
- op:flag("a", "append", function() options.append = true end)
- op:flag("t", "test", function()
- local n = arg[0]:gsub("markdown.lua", "markdown-tests.lua")
- local f = io.open(n)
- if f then
- f:close() dofile(n)
- else
- error("Cannot find markdown-tests.lua")
- end
- run_stdin = false
- end)
- op:flag("h", "help", function() print(help) run_stdin = false end)
- op:arg(function(path)
- local file = io.open(path) or error("Could not open file: " .. path)
- local s = file:read("*a")
- file:close()
- s = run(s, options)
- file = io.open(outpath(path, options), "w") or error("Could not open output file: " .. outpath(path, options))
- file:write(s)
- file:close()
- run_stdin = false
- end
- )
-
- if not op:run(arg) then
- print(help)
- run_stdin = false
- end
-
- if run_stdin then
- local s = io.read("*a")
- s = run(s, options)
- io.write(s)
- end
-end
-
--- If we are being run from the command-line, act accordingly
-if arg and arg[0]:find("markdown%.lua$") then
- run_command_line(arg)
-else
- return markdown
-end
diff --git a/docs/penlight.jpg b/docs/penlight.jpg
deleted file mode 100644
index 123c0ce..0000000
--- a/docs/penlight.jpg
+++ /dev/null
Binary files differ
diff --git a/docs/penlight.md b/docs/penlight.md
deleted file mode 100644
index 0d47bfe..0000000
--- a/docs/penlight.md
+++ /dev/null
@@ -1,2702 +0,0 @@
-![penlight](penlight.jpg)
-# Penlight - A Portable Lua Library
-
-The module documentation is available [here](api/index.html); and there is an alphabetical [function index](function_index.html).
-
-The latest version is available at [Github](http://github.com/stevedonovan/Penlight).
-
-## Introduction
-
-### Purpose
-
-It is often said of Lua that it does not include batteries. That is because the goal of Lua is to produce a lean, expressive language that will be used on all sorts of machines (some of which don't even have hierarchical filesystems). The Lua language is the equivalent of an operating system kernel; the creators of Lua do not see it as their responsibility to create a full software ecosystem around the language. That is the role of the community.
-
-A principle of software design is to recognize common patterns and reuse them. If you find yourself writing things like `io.write(string.format('the answer is %d ',42))` more than a few times, then it becomes useful just to define a function `printf`. This is good, not just because repeated code is harder to maintain, but because such code is easier to read, once people understand your libraries.
-
-Penlight captures many such code patterns, so that the intent of your code becomes clearer. For instance, a Lua idiom to copy a table is `{unpack(t)}`, but this will only work for 'small' tables (for a given value of 'small') so it is not very robust. Also, the intent is not clear. So `tablex.deepcopy` is provided, which will also copy nested tables and associated metatables, so it can be used to clone complex objects.
-
-The default error handling policy follows that of the Lua standard libraries: if an argument is the wrong type, then an error will be thrown, but otherwise we return `nil,message` if there is a problem. There are some exceptions; functions like `input.fields` default to shutting down the program immediately with a useful message. This is more appropriate behaviour for a _script_ than providing a stack trace. (However, this default can be changed.) The lexer functions always throw errors, to simplify coding, and so should be wrapped in `pcall`.
-
-By default, the error stacktrace starts with your code, since you are not usually interested in the internal details of the library. ??
-
-If you are used to Python conventions, please note that all indices consistently start at 1.
-
-The Lua function `table.foreach` has been deprecated in favour of the `for in` statement, but such an operation becomes particularly useful with the higher-order function support in Penlight. Note that `tablex.foreach` reverses the order, so that the function is passed the value and then the key. Although perverse, this matches the intended use better.
-
-The only important external dependence of Penlight is LuaFileSystem (`lfs`), and if you want `dir.copyfile` to work cleanly on Windows, you will need `alien` as well. (The fallback is to call the equivalent shell commands.)
-
-Some of the examples in this guide were created using [ilua](http://lua-users.org/wiki/InteractiveLua), which doesn't require '=' to print out expressions, and will attempt to print out table results as nicely as possible. This is also available under Lua for Windows, as a library, so the command `lua -lilua -s` will work (the `-s` option switches off 'strict' variable checking, which is annoying and conflicts with the use of `_DEBUG` in some of these libraries).
-
-### To Inject or not to Inject?
-
-It was realized a long time ago that large programs needed a way to keep names distinct by putting them into tables (Lua), namespaces (C++) or modules (Python). It is obviously impossible to run a company where everyone is called 'Bruce', except in Monty Python skits. These 'namespace clashes' are more of a problem in a simple language like Lua than in C++, because C++ does more complicated lookup over 'injected namespaces'. However, in a small group of friends, 'Bruce' is usually unique, so in particular situations it's useful to drop the formality and not use last names. It depends entirely on what kind of program you are writing, whether it is a ten line script or a ten thousand line program.
-
-So the Penlight library provides the formal way and the informal way, without imposing any preference. You can do it formally like:
-
- local utils = require 'pl.utils'
- utils.printf("%s\n","hello, world!")
-
-or informally like:
-
- require 'pl'
- utils.printf("%s\n","That feels better")
-
-`require 'pl'` makes all the separate Penlight modules available, without needing to require them each individually. Generally, the formal way is better when writing modules, since then there are no global side-effects and the dependencies of your module are made explicit.
-
-With Penlight after 0.9, please note that `require 'pl.utils'` no longer implies that a global table `pl.utils` exists, since these new modules are no longer created with `module()`.
-
-Penlight will not bring functions into the global table, or clobber standard tables like `io`. `require('pl')` will bring tables like `utils`, `tablex`, etc. into the global table _if they are used_. This 'load-on-demand' strategy ensures that the whole kitchen sink is not loaded up front, so this method is as efficient as explicitly loading required modules.
-
-You have an option to bring the `pl.stringx` methods into the standard string table. All strings have a metatable that allows for automatic lookup in `string`, so we can say `s:upper()`. Importing `stringx` allows for its functions to also be called as methods: `s:strip()`,etc:
-
- require 'pl'
- stringx.import()
-
-or, more explicitly:
-
- require('pl.stringx').import()
-
-A more delicate operation is importing tables into the local environment. This is convenient when the context makes the meaning of a name very clear:
-
- > require 'pl'
- > utils.import(math)
- > = sin(1.2)
- 0.93203908596723
-
-`utils.import` can also be passed a module name as a string, which is first required and then imported. If used in a module, `import` will bring the symbols into the module context.
-
-Keeping the global scope simple is very necessary with dynamic languages. Using global variables in a big program is always asking for trouble, especially since you do not have the spell-checking provided by a compiler. The `pl.strict` module enforces a simple rule: globals must be 'declared'. This means that they must be assigned before use; assigning to `nil` is sufficient.
-
- > require 'pl.strict'
- > print(x)
- stdin:1: variable 'x' is not declared
- > x = nil
- > print(x)
- nil
-
-The `strict` module provided by Penlight is compatible with the 'load-on-demand' scheme used by `require 'pl'`.
-
-`strict` also disallows assignment to global variables, except in the main program. Generally, modules have no business messing with global scope; if you must do it, then use a call to `rawset`. Similarly, if you have to check for the existence of a global, use `rawget`.
-
-If you wish to enforce strictness globally, then just add `require 'pl.strict'` at the end of `pl/init.lua`.
-
-### What are function arguments in Penlight?
-
-Many functions in Penlight themselves take function arguments, like `map` which applies a function to a list, element by element. You can use existing functions, like `math.max`, anonymous functions (like `function(x,y) return x > y end`), or operations by name (e.g. '*' or '..'). The module `pl.operator` exports all the standard Lua operations, like the Python module of the same name. Penlight allows these to be referred to by name, so `operator.gt` can be more concisely expressed as '>'.
-
-Note that the `map` functions pass any extra arguments to the function, so we can have `ls:filter('>',0)`, which is a shortcut for `ls:filter(function(x) return x > 0 end)`.
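-
-A short sketch of extra arguments being passed along, assuming Penlight is installed and using only the `List` methods named above (the variable names are illustrative):
-
-    local List = require 'pl.List'
-    local ls = List{-2, 5, 0, 7}
-    -- the extra argument 0 becomes the second argument of '>'
-    local positive = ls:filter('>', 0)
-    -- the same filter written out with an anonymous function
-    local same = ls:filter(function(x) return x > 0 end)
-    assert(positive == same)   -- lists compare by contents
-    print(positive)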
-
-Finally, `pl.func` supports _placeholder expressions_ in the Boost lambda style, so that an anonymous function to multiply the two arguments can be expressed as `_1*_2`.
-
-To use them directly, note that _all_ function arguments in Penlight go through `utils.function_arg`. `pl.func` registers itself with this function, so that you can directly use placeholder expressions with standard methods:
-
- > _1 = func._1
- > = List{10,20,30}:map(_1+1)
- {11,21,31}
-
-Another option for short anonymous functions is provided by `utils.string_lambda`; since 0.9 you have to explicitly ask for this feature:
-
- > L = require 'pl.utils'.string_lambda
- > = List{10,20,30}:map (L'|x| x + 1')
- {11,21,31}
-
-### Pros and Cons of Loopless Programming
-
-The standard loops-and-ifs 'imperative' style of programming is dominant, and often seems to be the 'natural' way of telling a machine what to do. It is in fact very much how the machine does things, but we need to take a step back and find ways of expressing solutions in a higher-level way. For instance, applying a function to all elements of a list is a common operation:
-
- local res = {}
- for i = 1,#ls do
- res[i] = fun(ls[i])
- end
-
-This can be efficiently and succinctly expressed as `ls:map(fun)`. Not only is there less typing but the intention of the code is clearer. If readers of your code spend too much time trying to guess your intention by analyzing your loops, then you have failed to express yourself clearly. Similarly, `ls:filter('>',0)` will give you all the values in a list greater than zero. (Of course, if you don't feel like using `List`, or have non-list-like tables, then `pl.tablex` offers the same facilities. In fact, the `List` methods are implemented using `tablex` functions.)
-
-A common observation is that loopless programming is less efficient, particularly in the way it uses memory. `ls1:map2('*',ls2):reduce '+'` will give you the dot product of two lists, but an unnecessary temporary list is created. But efficiency is relative to the actual situation, it may turn out to be _fast enough_, or may not appear in any crucial inner loops, etc.
-
-Writing loops is 'error-prone and tedious', as Stroustrup says. But any half-decent editor can be taught to do much of that typing for you. The question should actually be: is it tedious to _read_ loops? As with natural language, programmers tend to read chunks at a time. A for-loop causes no surprise, and probably little brain activity. One argument for loopless programming is that the loops that you _do_ write stand out more, and signal 'something different happening here'. It should not be an all-or-nothing thing, since most programs require a mixture of idioms that suit the problem. Some languages (like APL) do nearly everything with map and reduce operations on arrays, and so solutions can sometimes seem forced. Wisdom is knowing when a particular idiom makes a particular problem easy to _solve_ and the solution easy to _explain_ afterwards.
-
-<a id="utils"></a>
-
-### Utilities. Generally useful functions.
-
-The function `printf` discussed earlier is included in `pl.utils` because it makes properly formatted output easier. (There is an equivalent `fprintf` which also takes a file object parameter, just like the C function.)
-
-Utility functions like `is_callable` and `is_type` help with identifying what kind of animal you are dealing with. Obviously, a function is callable, but an object can be callable as well if it has overridden the `__call` metamethod. The Lua `type` function handles the basic types, but can't distinguish between different kinds of objects, which are all tables. So `is_type` handles both cases, like `is_type(s,"string")` and `is_type(ls,List)`.
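-
-The following is a small sketch of both helpers, assuming Penlight is installed; the `doubler` table is a made-up example of a callable object. It uses `assert` rather than printed results, since only the truthiness of the return values is relied on here:
-
-    local utils = require 'pl.utils'
-    local List = require 'pl.List'
-    -- a table becomes callable once its metatable has a __call entry
-    local doubler = setmetatable({}, {__call = function(self, x) return 2*x end})
-    assert(utils.is_callable(print))
-    assert(utils.is_callable(doubler))
-    assert(not utils.is_callable(42))
-    -- basic types are named by string, classes are passed directly
-    assert(utils.is_type('hello', 'string'))
-    assert(utils.is_type(List{1,2}, List))
-    print(doubler(21))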
-
-A common pattern when working with Lua varargs is capturing all the arguments in a table:
-
- function t(...)
- local args = {...}
- ...
- end
-
-But this will bite you someday when `nil` is one of the arguments, since this will put a 'hole' in your table. In particular, `#args` will only give you the size up to the `nil` value. Hence the need for `table.pack` - this is a new Lua 5.2 function which Penlight defines also for Lua 5.1. It returns a single table whose `n` field holds the true number of arguments.
-
- function t(...)
- local args = table.pack(...)
- for i = 1,args.n do
- ...
- end
- end
-
-The 'memoize' pattern occurs when you have a function which is expensive to call, but will always return the same value subsequently. `utils.memoize` is given a function, and returns another function. This calls the function the first time, saves the value for that argument, and thereafter for that argument returns the saved value. This is a more flexible alternative to building a table of values upfront, since in general you won't know what values are needed.
-
- sum = utils.memoize(function(n)
- local sum = 0
- for i = 1,n do sum = sum + i end
- return sum
- end)
- ...
- s = sum(1e8) --takes time!
- ...
- s = sum(1e8) --returned saved value!
-
-Penlight is fully compatible with Lua 5.1, 5.2 and LuaJIT 2. To ensure this, `utils` also defines the global Lua 5.2 [load](http://www.lua.org/work/doc/manual.html#pdf-load) function when needed. It takes up to four arguments:
-
- * the input (either a string or a function)
- * the source name used in debug information
- * the mode is a string that can have either or both of 'b' or 't', depending on whether the source is a binary chunk or text code (default is 'bt')
- * the environment for the compiled chunk
-
-Using `load` should reduce the need to call the deprecated function `setfenv`, and make your Lua 5.1 code 5.2-friendly.
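-
-As a rough sketch of the four-argument form (the chunk text, the name 'example' and the `env` table are all illustrative), a piece of code can be compiled against its own environment without touching `setfenv`:
-
-    require 'pl.utils'   -- on Lua 5.1 this supplies the 5.2-style load
-    local env = {x = 10}
-    -- input, source name, mode ('t' = text only), environment
-    local chunk, err = load('return x + 1', 'example', 't', env)
-    assert(chunk, err)
-    print(chunk())   --> 11
-
-The chunk sees only the fields of `env`, so the global `x` (if any) is never consulted.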
-
-<a id="app"/>
-### Application Support
-
-`app.parse_args` is a simple command-line argument parser. If called without any arguments, it tries to use the global `arg` array. It returns the _flags_ (options beginning with '-') as a table of name/value pairs, and the _arguments_ as an array. It knows about long GNU-style flag names, e.g. `--value`, and groups of short flags are understood, so that `-ab` is short for `-a -b`. The flags result would then look like `{value=true,a=true,b=true}`.
-
-Flags may take values. The command-line `--value=open -n10` would result in `{value='open',n='10'}`; generally you can use '=' or ':' to separate the flag from its value, except in the special case where a short flag is followed by an integer. Or you may specify upfront that some flags have associated values, and then the values will follow the flag.
-
- > require 'pl'
- > flags,args = app.parse_args({'-o','fred','-n10','fred.txt'},{o=true})
- > pretty.dump(flags)
- {o='fred',n='10'}
-
-`parse_args` is not intelligent or psychic; it will not convert any flag values or arguments for you, or raise errors. For that, have a look at [lapp](#lapp).
-
-An application which consists of several files usually cannot use `require` to load files in the same directory as the main script. `app.require_here()` ensures that the Lua module path is modified so that files found locally are found first. In the `examples` directory, `test-symbols.lua` uses this function to ensure that it can find `symbols.lua` even if it is not run from this directory.
-
-`app.appfile` will create a filename that your application can use to store its private data, based on the script name. For example, `app.appfile "test.txt"` from a script called `testapp.lua` produces the following file on my Windows machine:
-
- C:\Documents and Settings\SJDonova\.testapp\test.txt
-
-and the equivalent on my Linux machine:
-
- /home/sdonovan/.testapp/test.txt
-
-If `.testapp` does not exist, it will be created.
-
-Penlight makes it convenient to save application data in Lua format. You can use `pretty.dump(t,filename)` to write a Lua table in a human-readable form to a file, and `pretty.read(file.read(filename))` to generate the table again.
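-
-The same round trip can be sketched in memory. This assumes `pretty.write`, which returns the text that `pretty.dump` would write out; the `settings` table is a made-up example:
-
-    local pretty = require 'pl.pretty'
-    local settings = {name = 'testapp', size = 10}
-    -- serialize the table as Lua source text
-    local text = pretty.write(settings)
-    -- parse the text back into a fresh table
-    local copy = pretty.read(text)
-    assert(copy ~= settings)    -- a new table, not the original
-    print(copy.name, copy.size)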
-
-(@see app, @see pretty)
-
-
-<a id="class"/>
-
-### Classes. Simplifying Object-Oriented Programming in Lua
-
-Lua is similar to JavaScript in that the concept of class is not directly supported by the language. In fact, Lua has a very general mechanism for extending the behaviour of tables which makes it straightforward to implement classes. A table's behaviour is controlled by its metatable. If that metatable has a `__index` function or table, this will handle looking up anything which is not found in the original table. A class is just a table with an `__index` key pointing to itself. Creating an object involves making a table and setting its metatable to the class; then when handling `obj.fun`, Lua first looks up `fun` in the table `obj`, and if not found it looks it up in the class. `obj:fun(a)` is just short for `obj.fun(obj,a)`. So with the metatable mechanism and this bit of syntactic sugar, it is straightforward to implement classic object orientation.
-
- -- animal.lua
-
- class = require 'pl.class'
-
- class.Animal()
-
- function Animal:_init(name)
- self.name = name
- end
-
- function Animal:__tostring()
- return self.name..': '..self:speak()
- end
-
- class.Dog(Animal)
-
- function Dog:speak()
- return 'bark'
- end
-
- class.Cat(Animal)
-
- function Cat:_init(name,breed)
- self:super(name) -- must init base!
- self.breed = breed
- end
-
- function Cat:speak()
- return 'meow'
- end
-
- class.Lion(Cat)
-
- function Lion:speak()
- return 'roar'
- end
-
- fido = Dog('Fido')
- felix = Cat('Felix','Tabby')
- leo = Lion('Leo','African')
-
- $ lua -i animal.lua
- > = fido,felix,leo
- Fido: bark Felix: meow Leo: roar
- > = leo:is_a(Animal)
- true
- > = leo:is_a(Dog)
- false
- > = leo:is_a(Cat)
- true
-
-All Animal does is define `__tostring`, which Lua will use whenever a string representation is needed of the object. In turn, this relies on `speak`, which is not defined. So it's what C++ people would call an abstract base class; the specific derived classes like Dog define `speak`. (Please note that if derived classes have their own constructors, they must explicitly call the base constructor for their base class; this is conveniently available as the `super` method.)
-
-All such objects will have an `is_a` method, which looks up the inheritance chain to find a match. Another form is `class_of`, which can be safely called on all objects, so instead of `leo:is_a(Animal)` one can say `Animal:class_of(leo)`.
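-
-A compact sketch of the two forms, assuming Penlight is installed and using the `Name = class()` style of definition so that nothing is put in the global table:
-
-    local class = require 'pl.class'
-    local Animal = class()
-    local Dog = class(Animal)
-    local fido = Dog()
-    print(fido:is_a(Animal))       --> true
-    print(Animal:class_of(fido))   --> true
-    -- a plain table has no is_a method, but class_of still works on it
-    print(Animal:class_of({}))     --> false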
-
-There are two ways to define a class, either `class.Name()` or `Name = class()`; both work identically, except that the first form will always put the class in the current environment (whether global or module); the second form provides more flexibility about where to store the class. The first form does _name_ the class by setting the `_name` field, which can be useful in identifying the objects of this type later. This session illustrates the usefulness of having named classes, if no `__tostring` method is explicitly defined.
-
- > class.Fred()
- > a = Fred()
- > = a
- Fred: 00459330
- > Alice = class()
- > b = Alice()
- > = b
- table: 00459AE8
- > Alice._name = 'Alice'
- > = b
- Alice: 00459AE8
-
-So `Alice = class(); Alice._name = 'Alice'` is exactly the same as `class.Alice()`.
-
-This useful notation is borrowed from Hugo Etchegoyen's [classlib](http://lua-users.org/wiki/MultipleInheritanceClasses) which further extends this concept to allow for multiple inheritance.
-
-
-## Tables and Arrays
-
-<a id="list"/>
-
-### Python-style Lists
-
-One of the elegant things about Lua is that tables do the job of both lists and dicts (as called in Python) or vectors and maps (as called in C++), and they do it efficiently. However, if we are dealing with 'tables with numerical indices' we may as well call them lists and look for operations which particularly make sense for lists. The Penlight `List` class was originally written by Nick Trout for Lua 5.0, and translated to 5.1 and extended by myself. It seemed that borrowing from Python was a good idea, and this eventually grew into Penlight. (@see List)
-
-Here is an example showing `List` in action; it redefines `__tostring`, so that it can print itself out more sensibly:
-
- > List = require 'pl.List' --> automatic with require 'pl' <---
- > l = List()
- > l:append(10)
- > l:append(20)
- > = l
- {10,20}
- > = l:extend {30,40}
- {10,20,30,40}
- > = l:insert(1,5)
- {5,10,20,30,40}
- > = l:pop()
- 40
- > = l
- {5,10,20,30}
- > = l:index(30)
- 4
- > = l:contains(30)
- true
- > = l:reverse() ---> note: doesn't make a copy!
- {30,20,10,5}
-
-Although methods like `sort` and `reverse` operate in-place and change the list, they do return the original list. This makes it possible to do _method chaining_, like `ls = ls:append(10):append(20):reverse():append(1)`. But (and this is an important but) no extra copy is made, so `ls` does not change identity. `List` objects (like tables) are _mutable_, unlike strings. If you want a copy of a list, then `List(ls)` will do the job, i.e. it acts like a copy constructor. However, if passed any other table, `List` will just set the metatable of the table and _not_ make a copy.
-
-A particular feature of Python lists is _slicing_. This is fully supported in this version of `List`, except we use 1-based indexing. So `List.slice` works rather like `string.sub`:
-
- > l = List {10,20,30,40}
- > = l:slice(1,1) ---> note: creates a new list!
- {10}
- > = l:slice(2,2)
- {20}
- > = l:slice(2,3)
- {20,30}
- > = l:slice(2,-2)
- {20,30}
- > = l:slice_assign(2,2,{21,22,23})
- {10,21,22,23,30,40}
- > = l:chop(1,1)
- {21,22,23,30,40}
-
-Functions like `slice_assign` and `chop` modify the list; the first is equivalent to Python's `l[i1:i2] = seq` and the second to `del l[i1:i2]`.
-
-List objects are ultimately just Lua 'list-like' tables, but they have extra operations defined on them, such as equality and concatenation. For regular tables, equality is only true if the two tables are _identical objects_, whereas two lists are equal if they have the same contents, i.e. that `l1[i]==l2[i]` for all elements.
-
- > l1 = List {1,2,3}
- > l2 = List {1,2,3}
- > = l1 == l2
- true
- > = l1..l2
- {1,2,3,1,2,3}
-
-The `List` constructor can be passed a function. If so, it's assumed that this is an iterator function that can be repeatedly called to generate a sequence. One such function is `io.lines`; the following short, intense little script counts the number of lines in standard input:
-
- -- linecount.lua
- require 'pl'
- ls = List(io.lines())
- print(#ls)
-
-`List.iterate` captures what `List` considers a sequence. In particular, it can also iterate over all 'characters' in a string:
-
- > for ch in List.iterate 'help' do io.write(ch,' ') end
- h e l p >
-
-Since the function `iterate` is used internally by the `List` constructor, strings can be made into lists of character strings very easily.
-
-There are a number of operations that go beyond the standard Python methods. For instance, you can _partition_ a list into a table of sublists using a function. In the simplest form, you use a predicate (a function returning a boolean value) to partition the list into two lists, one of elements matching and another of elements not matching. But you can use any function; if we use `type` then the keys will be the standard Lua type names.
-
- > ls = List{1,2,3,4}
- > ops = require 'pl.operator'
- > = ls:partition(function(x) return x > 2 end)
- {false={1,2},true={3,4}}
- > ls = List{'one',math.sin,List{1},10,20,List{1,2}}
- > = ls:partition(type)
- {function={function: 00369110},string={one},number={10,20},table={{1},{1,2}}}
-
-This is one `List` method which returns a table which is not a `List`. Bear in mind that you can always call a `List` method on a plain table argument, so `List.partition(t,type)` works as expected. But these functions will only operate on the array part of the table.
-
-Stacks occur everywhere in computing. `List` supports stack-like operations; there is already `pop` (remove and return last value) and `append` acts like `push` (add a value to the end). `push` is provided as an alias for `append`, and the other stack operation (size) is simply the size operator `#`. Queues can also be implemented; you use `pop` to take values out of the queue, and `put` to insert a value at the beginning.
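-
-A minimal sketch of both uses, assuming Penlight is installed (the variable names are illustrative):
-
-    local List = require 'pl.List'
-    -- stack: push adds at the end, pop removes from the end (last in, first out)
-    local stack = List()
-    stack:push(1)
-    stack:push(2)
-    stack:push(3)
-    local top = stack:pop()
-    print(top, #stack)   --> 3  2
-    -- queue: put inserts at the beginning, pop removes from the end (first in, first out)
-    local queue = List()
-    queue:put('a')
-    queue:put('b')
-    print(queue:pop())   --> a
-
-Note that `put` has to shift every existing element along by one, so this queue is convenient rather than fast for long lists.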
-
-
-### Map and Set classes
-
-The `Map` class exposes what Python would call a 'dict' interface, and accesses the hash part of the table. The name 'Map' is used to emphasize the interface, not the implementation; it is an object which maps keys onto values; `m['alice']` or the equivalent `m.alice` is the access operation. This class also provides explicit `set` and `get` methods, which are trivial for regular maps but get interesting when `Map` is subclassed. The other operation is `update`, which extends a map by copying the keys and values from another table, perhaps overwriting existing keys:
-
- > Map = require 'pl.Map'
- > m = Map{one=1,two=2}
- > m:update {three=3,four=4,two=20}
- > = m == Map{one=1,two=20,three=3,four=4}
- true
-
-The method `values` returns a list of the values, and `keys` returns a list of the keys; there is no guarantee of order. `getvalues` is given a list of keys and returns a list of values associated with these keys:
-
- > m = Map{one=1,two=2,three=3}
- > = m:getvalues {'one','three'}
- {1,3}
- > = m:getvalues(m:keys()) == m:values()
- true
-
-When querying the value of a `Map`, it is best to use the `get` method:
-
- > print(m:get 'one', m:get 'two')
- 1 2
-
-The reason is that `m[key]` can be ambiguous; due to the current implementation, `m["get"]` will always succeed, because if a value is not present in the map, it will be looked up in the `Map` metatable, which contains a method `get`. There is currently no simple solution to this annoying restriction.
-
-A `Set` is a special kind of `Map`, where all the values are `true`. So `get` will always return either `true` or `nil`; all the values are keys, and the order is not important. So in this case `values` is defined to return a list of the keys. Sets can display themselves, and the basic operations like `union` (`+`) and `intersection` (`*`) are defined.
-
- > Set = require 'pl.Set'
- > = Set{'one','two'} == Set{'two','one'}
- true
- > fruit = Set{'apple','banana','orange'}
- > = fruit['banana']
- true
- > = fruit['hazelnut']
- nil
- > = fruit:values()
- {apple,orange,banana}
- > colours = Set{'red','orange','green','blue'}
- > = fruit,colours
- [apple,orange,banana] [blue,green,orange,red]
- > = fruit+colours
- [blue,green,apple,red,orange,banana]
- > = fruit*colours
- [orange]
-
-There are also the methods `difference` and `symmetric_difference`. The first answers the question 'what fruits are not colours?' and the second 'what are fruits and colours but not both?'
-
- > = fruit - colours
- [apple,banana]
- > = fruit ^ colours
- [blue,green,apple,red,banana]
-
-Adding elements to a set is either done like `fruit['peach'] = true` or by `fruit:set('peach')`. Removing is either `fruit['apple'] = nil` or `fruit:unset('apple')`.
-
-There are also some useful classes which inherit from `Map`. An `OrderedMap` behaves like a `Map` but keeps its keys in order if you use its `set` method to add keys and values. Like all the 'container' classes in Penlight, it defines an `iter` method for iterating over its values; this will return the keys and values in the order of insertion; the `keys` and `values` methods likewise.
-
-A `MultiMap` allows multiple values to be associated with a given key. So `set` (as before) takes a key and a value, but calling it with the same key and a different value does not overwrite but adds a new value. `get` (or using `[]`) will return a list of values.
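-
-The following sketch shows both classes side by side. It assumes they can be loaded as `pl.OrderedMap` and `pl.MultiMap`, and the keys and values are made up for illustration:
-
-    local OrderedMap = require 'pl.OrderedMap'
-    local MultiMap = require 'pl.MultiMap'
-    -- keys added with set come back in insertion order
-    local om = OrderedMap()
-    om:set('one', 1)
-    om:set('two', 2)
-    om:set('three', 3)
-    for k, v in om:iter() do print(k, v) end
-    -- setting the same key twice accumulates values rather than overwriting
-    local mm = MultiMap()
-    mm:set('fruit', 'apple')
-    mm:set('fruit', 'banana')
-    print(#mm:get('fruit'))   --> 2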
-
-(@see Map, @see Set)
-
-### Tablex. Useful Operations on Tables
-
-Some notes on terminology: Lua tables are usually _list-like_ (like an array) or _map-like_ (like an associative array or dict); they can of course have a list-like and a map-like part. Some of the table operations only make sense for list-like tables, and some only for map-like tables. (The usual Lua terminology is the array part and the hash part of the table, which reflects the actual implementation used; it is more accurate to say that a Lua table is an associative map which happens to be particularly efficient at acting like an array.)
-
-The functions provided in `table` provide all the basic manipulations on Lua tables, but as we saw with the `List` class, it is useful to build higher-level operations on top of those functions. For instance, to copy a table involves this kind of loop:
-
- local res = {}
- for k,v in pairs(T) do
- res[k] = v
- end
- return res
-
-The `tablex` module (@see tablex) provides this as `copy`, which does a _shallow_ copy of a table. There is also `deepcopy` which goes further than a simple loop in two ways; first, it also gives the copy the same metatable as the original (so it can copy objects like `List` above) and second, any nested tables will also be copied, to arbitrary depth. There is also `icopy` which operates on list-like tables, where you can optionally set the start index of the source and destination as well. It ensures that any left-over elements will be deleted:
-
- asserteq(icopy({1,2,3,4,5,6},{20,30}),{20,30}) -- start at 1
- asserteq(icopy({1,2,3,4,5,6},{20,30},2),{1,20,30}) -- start at 2
- asserteq(icopy({1,2,3,4,5,6},{20,30},2,2),{1,30}) -- start at 2, copy from 2
-
-(This code from the `tablex` test module shows the use of `pl.test.asserteq`)
-
-By contrast, `move` overwrites but does not delete the rest of the destination:
-
- asserteq(move({1,2,3,4,5,6},{20,30}),{20,30,3,4,5,6})
- asserteq(move({1,2,3,4,5,6},{20,30},2),{1,20,30,4,5,6})
- asserteq(move({1,2,3,4,5,6},{20,30},2,2),{1,30,3,4,5,6})
-
-(The difference is somewhat like that between C's `strcpy` and `memmove`.)
-
-To summarize, use `copy` or `deepcopy` to make a copy of an arbitrary table. To copy into a map-like table, use `update`; to copy into a list-like table use `icopy`, and `move` if you are updating a range in the destination.
-
-To complete this set of operations, there is `insertvalues` which works like `table.insert` except that one provides a table of values to be inserted, and `removevalues` which removes a range of values.
-
- asserteq(insertvalues({1,2,3,4},2,{20,30}),{1,20,30,2,3,4})
- asserteq(insertvalues({1,2},{3,4}),{1,2,3,4})
-
-Another example:
-
- > T = require 'pl.tablex'
- > t = {10,20,30,40}
- > = T.removevalues(t,2,3)
- {10,40}
- > = T.insertvalues(t,2,{20,30})
- {10,20,30,40}
-
-
-In a similar spirit to `deepcopy`, `deepcompare` will take two tables and return true only if they have exactly the same values and structure.
-
- > t1 = {1,{2,3},4}
- > t2 = deepcopy(t1)
- > = t1 == t2
- false
- > = deepcompare(t1,t2)
- true
-
-`find` will return the index of a given value in a list-like table. Note that like `string.find` you can specify an index to start searching, so that all instances can be found. There is an optional fourth argument, which makes the search start at the end and go backwards, so we could define `rfind` like so:
-
- function rfind(t,val,istart)
- return tablex.find(t,val,istart,true)
- end
-
-`find` does a linear search, so it can slow down code that depends on it. If efficiency is required for large tables, consider using an _index map_. `index_map` will return a table where the keys are the original values of the list, and the associated values are the indices. (It is almost exactly the representation needed for a _set_.)
-
- > t = {'one','two','three'}
- > = tablex.find(t,'two')
- 2
- > = tablex.find(t,'four')
- nil
- > il = tablex.index_map(t)
- > = il['two']
- 2
- > = il.two
- 2
-
-A version of `index_map` called `makeset` is also provided, where the values are just `true`. This is useful because two such sets can be compared for equality using `deepcompare`:
-
- > = deepcompare(makeset {1,2,3},makeset {2,1,3})
- true
-
-Consider the problem of determining the new employees that have joined in a period. Assume we have two files of employee names:
-
- (last-month.txt)
- smith,john
- brady,maureen
- mogale,thabo
-
- (this-month.txt)
- smith,john
- smit,johan
- brady,maureen
- mogale,thabo
- van der Merwe,Piet
-
-To find out differences, just make the employee lists into sets, like so:
-
- require 'pl'
-
- function read_employees(file)
- local ls = List(io.lines(file)) -- a list of employees
- return tablex.makeset(ls)
- end
-
- last = read_employees 'last-month.txt'
- this = read_employees 'this-month.txt'
-
- -- who is in this but not in last?
- diff = tablex.difference(this,last)
-
- -- in a set, the keys are the values...
- for e in pairs(diff) do print(e) end
-
- -- *output*
- -- van der Merwe,Piet
- -- smit,johan
-
-The `difference` operation is easy to write and read:
-
- for e in pairs(this) do
- if not last[e] then
- print(e)
- end
- end
-
-The point of using `difference` here is not that it is a tricky thing to code; it is that you are stating your intentions clearly to other readers of your code. (And naturally to your future self, in six months' time.)
-
-`find_if` will search a table using a function. The optional third argument is a value which will be passed as a second argument to the function. `pl.operator` provides the Lua operators conveniently wrapped as functions, so the basic comparison functions are available:
-
- > ops = require 'pl.operator'
- > = tablex.find_if({10,20,30,40},ops.gt,20)
- 3 true
-
-Note that `find_if` will also return the _actual value_ returned by the function, which of course is usually just `true` for a boolean function, but any value which is not `nil` and not `false` can be usefully passed back.
-
-`deepcompare` does a thorough recursive comparison, but otherwise using the default equality operator. `compare` allows you to specify exactly what function to use when comparing two list-like tables, and `compare_no_order` is true if they contain exactly the same elements. Do note that the latter does not need an explicit comparison function - in this case the implementation is actually to compare the two sets, as above:
-
- > = compare_no_order({1,2,3},{2,1,3})
- true
- > = compare_no_order({1,2,3},{2,1,3},'==')
- true
-
-(Note the special string '==' above; instead of saying `ops.gt` or `ops.eq` we can use the strings '>' or '==' respectively.)
-
-There are several ways to merge tables in PL. If they are list-like, then see the operations defined by `pl.List`, like concatenation. If they are map-like, then `merge` provides two basic operations. If the third arg is false, then the result only contains the keys that are in common between the two tables, and if true, then the result contains all the keys of both tables. These are in fact generalized set union and intersection operations:
-
- > S1 = {john=27,jane=31,mary=24}
- > S2 = {jane=31,jones=50}
- > = tablex.merge(S1, S2, false)
- {jane=31}
- > = tablex.merge(S1, S2, true)
- {mary=24,jane=31,john=27,jones=50}
-
-When working with tables, you will often find yourself writing loops like in the first example. Loops are second nature to programmers, but they are often not the most elegant and self-describing way of expressing an operation. Consider the `map` function, which creates a new table by applying a function to each element of the original:
-
- > = map(math.sin, {1,2,3,4})
- { 0.84, 0.91, 0.14, -0.76}
- > = map(function(x) return x*x end, {1,2,3,4})
- {1,4,9,16}
-
-`map` saves you from writing a loop, and the resulting code is often clearer, as well as being shorter. This is not to say that 'loops are bad' (although you will hear that from some extremists), just that it's good to capture standard patterns. Then the loops you do write will stand out and acquire more significance.
-
-`pairmap` is interesting, because the function works with both the key and the value.
-
- > t = {fred=10,bonzo=20,alice=4}
- > = pairmap(function(k,v) return v end, t)
- {4,10,20}
- > = pairmap(function(k,v) return k end, t)
- {'alice','fred','bonzo'}
-
-(These are common enough operations that the first is defined as `values` and the second as `keys`.) If the function returns two values, then the _second_ value is considered to be the new key:
-
- > = pairmap(function(k,v) return v+10, k:upper() end, t)
- {BONZO=30,FRED=20,ALICE=14}
-
-`map2` applies a function to two tables:
-
- > = map2(ops.add,{1,2},{10,20})
- {11,22}
- > = map2('*',{1,2},{10,20})
- {10,40}
-
-The various map operations generate tables; `reduce` applies a function of two arguments over a table and returns the result as a scalar:
-
- > = reduce ('+', {1,2,3})
- 6
- > = reduce ('..', {'one','two','three'})
- 'onetwothree'
-
-Finally, `zip` sews different tables together:
-
- > = zip({1,2,3},{10,20,30})
- {{1,10},{2,20},{3,30}}
-
-Browsing through the documentation, you will find that `tablex` and `List` share methods. For instance, `tablex.imap` and `List.map` are basically the same function; they both operate over the array-part of the table and generate another table. This can also be expressed as a _list comprehension_ `C 'f(x) for x' (t)` which makes the operation more explicit. So why are there different ways to do the same thing? The main reason is that not all tables are Lists: the expression `ls:map('#')` will return a _list_ of the lengths of any elements of `ls`. A list is a thin wrapper around a table, provided by the metatable `List`. Sometimes you may wish to work with ordinary Lua tables; the `List` interface is not a compulsory way to use Penlight table operations.
-
-### Operations on two-dimensional tables
-
-Two-dimensional tables are of course easy to represent in Lua, for instance `{{1,2},{3,4}}` where we store rows as subtables and index like so `A[row][col]`. This is the common representation used by matrix libraries like [LuaMatrix](http://lua-users.org/wiki/LuaMatrix). `pl.array2d` does not provide matrix operations, since that is the job for a specialized library, but rather provides generalizations of the higher-level operations provided by `pl.tablex` for one-dimensional arrays.
-
-`iter` is a useful generalization of `ipairs`. (The extra parameter determines whether you want the indices as well.)
-
- > a = {{1,2},{3,4}}
- > for i,j,v in array2d.iter(a,true) do print(i,j,v) end
- 1 1 1
- 1 2 2
- 2 1 3
- 2 2 4
-
-Note that you can always convert an arbitrary 2D array into a 'list of lists' with `List(tablex.map(List,a))`
-
-`map` will apply a function over all elements (notice that extra arguments can be provided, so this operation is in effect `function(x) return x-1 end`)
-
- > = array2d.map('-',a,1)
- {{0,1},{2,3}}
-
-2D arrays are stored as an array of rows, but columns can be extracted:
-
- > = array2d.column(a,1)
- {1,3}
-
-There are three equivalents to `tablex.reduce`. You can either reduce along the rows (which is the most efficient) or reduce along the columns. Either one will give you a 1D array. And `reduce2` will apply two operations: the first one reduces the rows, and the second reduces the result.
-
- > = array2d.reduce_rows('+',a)
- {3,7}
- > = array2d.reduce_cols('+',a)
- {4,6}
- > -- same as tablex.reduce('*',array2d.reduce_rows('+',a))
- > = array2d.reduce2('*','+',a)
- 21
-
-`tablex.map2` applies an operation to two tables, giving another table. `array2d.map2` does this for 2D arrays. Note that you have to provide the _rank_ of the arrays involved, since it's hard to always correctly deduce this from the data:
-
- > b = {{10,20},{30,40}}
- > a = {{1,2},{3,4}}
- > = array2d.map2('+',2,2,a,b) -- two 2D arrays
- {{11,22},{33,44}}
- > = array2d.map2('+',1,2,{10,100},a) -- 1D, 2D
- {{11,102},{13,104}}
- > = array2d.map2('*',2,1,a,{1,-1}) -- 2D, 1D
- {{1,-2},{3,-4}}
-
-Of course, you are not limited to simple arithmetic. Say we have a 2D array of strings, and wish to print it out with proper right justification. The first step is to create all the string lengths by mapping `string.len` over the array, the second is to reduce this along the columns using `math.max` to get maximum column widths, and last, apply `string.rjust` with these widths.
-
- maxlens = reduce_cols(math.max,map('#',lines))
- lines = map2(string.rjust,2,1,lines,maxlens)
-
-There is `product` which returns the _Cartesian product_ of two 1D arrays. The result is a 2D array formed from applying the function to all possible pairs from the two arrays.
-
- > array2d.product('{}',{1,2},{'a','b'})
- {{{1,'b'},{2,'a'}},{{1,'a'},{2,'b'}}}
-
-There is a set of operations which work in-place on 2D arrays. You can `swap_rows` and `swap_cols`; the first really is a simple one-liner, but the idea is to give the operation a name. `remove_row` and `remove_col` are generalizations of `table.remove`. Likewise, `extract_rows` and `extract_cols` are given arrays of indices and discard anything else. So, for instance, `extract_cols(A,{2,4})` will leave just columns 2 and 4 in the array.
-
-`List.slice` is often useful on 1D arrays; `array2d.slice` does the same thing, but is generally given a start (row,column) and an end (row,column).
-
- > A = {{1,2,3},{4,5,6},{7,8,9}}
- > B = slice(A,1,1,2,2)
- > write(B)
- 1 2
- 4 5
- > B = slice(A,2,2)
- > write(B,nil,'%4.1f')
- 5.0 6.0
- 8.0 9.0
-
-Here `array2d.write` is used to print out an array nicely; the second parameter is `nil`, which is the default (stdout) but can be any file object and the third parameter is an optional format (as used in `string.format`).
-
-`parse_range` will take a spreadsheet range like 'A1:B2' or 'R1C1:R2C2' and return the range as four numbers, which can be passed to `slice`. The rule is that `slice` will return an array of the appropriate shape depending on the range; if a range represents a row or a column, the result is 1D, otherwise 2D.
-
-This applies to `iter` as well, which can also optionally be given a range:
-
-
- > for i,j,v in iter(A,true,2,2) do print(i,j,v) end
- 2 2 5
- 2 3 6
- 3 2 8
- 3 3 9
-
-(@see array2d)
-
-## Strings. Higher-level operations on strings.
-
-### Extra String Methods
-
-These are convenient borrowings from Python, as described in 3.6.1 of the Python reference, but note that indices in Lua always begin at one. There are methods like `s:isalpha()` and `s:isdigit()`, which return true if s is only composed of letters or digits respectively. `s:startswith()` and `s:endswith()` are convenient ways to find substrings. (`endswith` works as in Python 2.5, so that `f:endswith {'.bat','.exe','.cmd'}` will be true for any filename which ends with these extensions.) There are justify methods and whitespace trimming functions like `strip`.
-
- > stringx.import()
- > ('bonzo.dog'):endswith {'.dog','.cat'}
- true
- > ('bonzo.txt'):endswith {'.dog','.cat'}
- false
- > ('bonzo.cat'):endswith {'.dog','.cat'}
- true
- > (' stuff'):rjust(20,'+')
- '++++++++++++++ stuff'
- > (' stuff '):lstrip()
- 'stuff '
- > (' stuff '):rstrip()
- ' stuff'
- > (' stuff '):strip()
- 'stuff'
- > for s in ('one\ntwo\nthree\n'):lines() do print(s) end
- one
- two
- three
-
-Most of these can be fairly easily implemented using the Lua string library, which is more general and powerful. But they are convenient operations to have easily at hand. Note that they can be injected into the `string` table if you use `stringx.import()`, but a simple alias like `local stringx = require 'pl.stringx'` is preferable. This is the recommended practice when writing modules for consumption by other people, since it is bad manners to change the global state of the rest of the system.
-
-(@see stringx)
-
-<a id="templates"></a>
-
-### String Templates
-
-Another borrowing from Python, string templates allow you to substitute values looked up in a table:
-
- local Template = require ('pl.text').Template
- t = Template('${here} is the $answer')
- print(t:substitute {here = 'Lua', answer = 'best'})
- ==>
- Lua is the best
-
-'$ variables' can optionally have curly braces; this form is useful if you are gluing text together to make variables, e.g. `${prefix}_name_${postfix}`. The `substitute` method will throw an error if a $ variable is not found in the table, whereas the `safe_substitute` method will not.
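-
-As a rough sketch of the difference between the two methods (this assumes Penlight is installed and that `Template` lives in `pl.text`, as in the example above), `pcall` can be used to catch the error thrown by `substitute`:
-
- local Template = require('pl.text').Template
- local t = Template('${here} is the $answer')
- -- 'answer' is deliberately missing from the table
- local ok, err = pcall(t.substitute, t, {here = 'Lua'})
- print(ok, err) -- ok is expected to be false, with err holding the message
- -- safe_substitute does not raise an error for the missing variable
- print(t:safe_substitute {here = 'Lua'})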
-
-The Lua implementation has an extra method, `indent_substitute` which is very useful for inserting blocks of text, because it adjusts indentation. Consider this example:
-
- -- testtemplate.lua
- local Template = require('pl.text').Template
-
- t = Template [[
-     for i = 1,#$t do
-         $body
-     end
- ]]
-
- body = Template [[
- local row = $t[i]
- for j = 1,#row do
-     fun(row[j])
- end
- ]]
-
- print(t:indent_substitute {body=body,t='tbl'})
-
-And the output is:
-
-     for i = 1,#tbl do
-         local row = tbl[i]
-         for j = 1,#row do
-             fun(row[j])
-         end
-     end
-
-`indent_substitute` can substitute templates as well as strings, in which case they themselves will be substituted using the given table. So in this case, `$t` was substituted twice.
-
-`pl.text` also has a number of useful functions like `dedent`, which strips all the initial indentation from a multiline string. As in Python, this is useful for preprocessing multiline strings if you like indenting them with your code. The function `wrap` is passed a long string (a _paragraph_) and returns a list of lines that fit into a desired line width. As an extension, there is also `indent` for indenting multiline strings.
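-
-A small sketch of these three helpers, assuming Penlight is installed; the sample strings and the width of 30 are arbitrary choices for illustration:
-
- local text = require 'pl.text'
- -- remove the initial indentation from a multiline string
- local s = text.dedent [[
-     one
-     two
-     three
- ]]
- io.write(s)
- -- break a paragraph into a list of lines that fit a width of 30
- local para = 'The quick brown fox jumps over the lazy dog, again and again.'
- for _,line in ipairs(text.wrap(para,30)) do
-     print(line)
- end
- -- indent every line of a multiline string by four spaces
- io.write(text.indent('first\nsecond\n',4))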
-
-New in Penlight with the 0.9 series is `text.format_operator`. Calling this enables Python-style string formatting using the modulo operator `%`:
-
- > text.format_operator()
- > = '%s[%d]' % {'dog',1}
- dog[1]
-
-So in its simplest form it saves the typing involved with `string.format`; it will also expand `$` variables using named fields:
-
- > = '$animal[$num]' % {animal='dog',num=1}
- dog[1]
-
-<a id="rici_templates"></a>
-
-A new module is `template`, which is a version of Rici Lake's [Lua Preprocessor](http://lua-users.org/wiki/SlightlyLessSimpleLuaPreprocessor). This allows you to mix Lua code with your templates in a straightforward way. There are only two rules:
-
- - Lines beginning with `#` are Lua
- - Otherwise, anything inside `$()` is a Lua expression.
-
-So a template generating an HTML list would look like this:
-
- <ul>
- # for i,val in ipairs(T) do
- <li>$(i) = $(val:upper())</li>
- # end
- </ul>
-
-Assume the text is inside `tmpl`, then the template can be expanded using:
-
- local template = require 'pl.template'
- res = template.substitute(tmpl,{T = {'one','two','three'}})
-
-and we get
-
- <ul>
- <li>1 = ONE</li>
- <li>2 = TWO</li>
- <li>3 = THREE</li>
- </ul>
-
-There is a single function, `substitute` which is passed a template string and an environment table. This table may contain some special fields, like `_parent` which can be set to a table representing a 'fallback' environment in case a symbol was not found. `_brackets` is usually '()' and `_escape` is usually '#' but it's sometimes necessary to redefine these if the defaults interfere with the target language - for instance, `$(V)` has another meaning in Make, and `#` means a preprocessor line in C/C++.
-
-Finally, if something goes wrong, passing `_debug` will cause the intermediate Lua code to be dumped.
-
-Here is a C code generation example; something that could easily be extended to be a minimal Lua extension skeleton generator.
-
- local subst = require 'pl.template'.substitute
-
- local templ = [[
- #include <lua.h>
- #include <lauxlib.h>
- #include <lualib.h>
-
- > for _,f in ipairs(mod) do
- static int l_$(f.name) (lua_State *L) {
-
- }
- > end
-
- static const luaL_reg $(mod.name)[] = {
- > for _,f in ipairs(mod) do
- {"$(f.name)",l_$(f.name)},
- > end
- {NULL,NULL}
- };
-
- int luaopen_$(mod.name) (lua_State *L) {
- luaL_register (L, "$(mod.name)", $(mod.name));
- return 1;
- }
- ]]
-
- print(subst(templ,{
- _escape = '>',
- ipairs = ipairs,
- mod = {
- name = 'baggins';
- {name='frodo'},
- {name='bilbo'}
- }
- }))
-
-(@see text, @see template)
-
-<a id="stringio"></a>
-
-### File-style I/O on Strings
-
-`pl.stringio` provides just three functions; `stringio.open` is passed a string, and returns a file-like object for reading. It supports a `read` method, which takes the same arguments as standard file objects:
-
- > f = stringio.open 'first line\n10 20 30\n'
- > = f:read()
- first line
- > = f:read('*n','*n','*n')
- 10 20 30
-
-`lines` and `seek` are also supported.
-
-`stringio.lines` is a useful short-cut for iterating over all the lines in a string.
-
-`stringio.create` creates a writeable file-like object. You then `write` to this stream, and finally extract the built string using `value`. This 'string builder' pattern is useful for efficiently creating large strings.
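-
-A minimal sketch of this 'string builder' pattern, assuming Penlight is installed (the loop contents are only illustrative):
-
- local stringio = require 'pl.stringio'
- local out = stringio.create()
- for i = 1,3 do
-     out:write('line ',i,'\n')
- end
- -- value() returns everything written so far as a single string
- io.write(out:value())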
-
-## Paths and Directories
-
-### Working with Paths
-
-Programs should not depend on quirks of your operating system. They will be harder to read, and need to be ported for other systems. The worst of course is hardcoding paths like 'c:\\' in programs, and wondering why Vista complains so much. But even something like `dir..'\\'..file` is a problem, since Unix can't understand backslashes in this way. `dir..'/'..file` is _usually_ portable, but it's best to put this all into a simple function, `path.join`. If you consistently use `path.join`, then it's much easier to write cross-platform code, since it handles the directory separator for you.
-
-`pl.path` provides the same functionality as Python's `os.path` module (11.1).
-
- > p = 'c:\\bonzo\\DOG.txt'
- > = path.normcase (p) ---> only makes sense on Windows
- c:\bonzo\dog.txt
- > = path.splitext (p)
- c:\bonzo\DOG .txt
- > = path.extension (p)
- .txt
- > = path.basename (p)
- DOG.txt
- > = path.exists(p)
- false
- > = path.join ('fred','alice.txt')
- fred\alice.txt
- > = path.exists 'pretty.lua'
- true
- > = path.getsize 'pretty.lua'
- 2125
- > = path.isfile 'pretty.lua'
- true
- > = path.isdir 'pretty.lua'
- false
-
-
-It is very important for all programmers, not just on Unix, to only write to where they are allowed to write. `path.expanduser` will expand '~' (tilde) into the home directory. Depending on your OS, this will be a guaranteed place where you can create files:
-
- > = path.expanduser '~/mydata.txt'
- 'C:\Documents and Settings\SJDonova/mydata.txt'
-
- > = path.expanduser '~/mydata.txt'
- /home/sdonovan/mydata.txt
-
-Under Windows, `os.tmpname` returns a path which leads to your drive root full of temporary files. (And increasingly, you do not have access to this root folder.) This is corrected by `path.tmpname`, which uses the environment variable TMP:
-
- > os.tmpname() -- not a good place to put temporary files!
- '\s25g.'
- > path.tmpname()
- 'C:\DOCUME~1\SJDonova\LOCALS~1\Temp\s25g.1'
-
-
-A useful extra function is `pl.path.package_path`, which will tell you the path of a particular Lua module. So on my system, `package_path('pl.path')` returns 'C:\Program Files\Lua\5.1\lualibs\pl\path.lua', and `package_path('lfs')` returns 'C:\Program Files\Lua\5.1\clibs\lfs.dll'. It is implemented in terms of `package.searchpath`, a function new in Lua 5.2 which Penlight also implements for Lua 5.1.
-
-### File Operations
-
-`pl.file` is a new module that provides more sensible names for common file operations. For instance, `file.read` and `file.write` are aliases for `utils.readfile` and `utils.writefile`.
-
-Smaller files can be efficiently read and written in one operation. `file.read` is passed a filename and returns the contents as a string, if successful; if not, then it returns `nil` and the actual error message. There is an optional boolean parameter if you want the file to be read in binary mode (this makes no difference on Unix but remains important with Windows.)
-
-In previous versions of Penlight, `utils.readfile` would read standard input if the file was not specified, but this can lead to nasty bugs; use `io.read '*a'` to grab all of standard input.
-
-Similarly, `file.write` takes a filename and a string which will be written to that file.
-
-For example, this little script converts a file into upper case:
-
- require 'pl'
- assert(#arg == 2, 'supply two filenames')
- text = assert(file.read(arg[1]))
- assert(file.write(arg[2],text:upper()))
-
-Copying files is surprisingly tricky. `file.copy` and `file.move` attempt to use the best implementation possible. On Windows, they link to the API functions `CopyFile` and `MoveFile`, but only if the `alien` package is installed (this is true for Lua for Windows.) Otherwise, the system copy command is used. This can be ugly when writing Windows GUI applications, because of the dreaded flashing black-box problem with launching processes.
-
-### Directory Operations
-
-`pl.dir` provides some useful functions for working with directories. `fnmatch` will match a filename against a shell pattern, and `filter` will return any files in the supplied list which match the given pattern; these correspond to the functions in the Python `fnmatch` module. `getdirectories` will return all directories contained in a directory, and `getfiles` will return all files in a directory which match a shell pattern. (These functions return the files as a table, unlike `lfs.dir` which returns an iterator.)
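-
-A short sketch of these functions, assuming Penlight and LuaFileSystem are installed; the file names and the use of the current directory are illustrative only:
-
- local dir = require 'pl.dir'
- -- test a single name against a shell pattern
- print(dir.fnmatch('readme.md','*.md'))
- -- keep only the names in a list that match the pattern
- local names = dir.filter({'a.lua','b.txt','c.lua'},'*.lua')
- print(#names)
- -- files and subdirectories of the current directory, as tables
- for _,f in ipairs(dir.getfiles('.','*.lua')) do print(f) end
- for _,d in ipairs(dir.getdirectories('.')) do print(d) end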
-
-`dir.makepath` can create a full path, creating subdirectories as necessary; `rmtree` is the Nuclear Option of file deleting functions, since it will recursively clear out and delete all directories found beginning at a path (there is a similar function with this name in the Python `shutil` module.)
-
- > = dir.makepath 't\\temp\\bonzo'
- > = path.isdir 't\\temp\\bonzo'
- true
- > = dir.rmtree 't'
-
-`dir.rmtree` depends on `dir.walk`, which is a powerful tool for scanning a whole directory tree. Here is the implementation of `dir.rmtree`:
-
- --- remove a whole directory tree.
- -- @param fullpath A directory path
- function dir.rmtree(fullpath)
-     -- walk bottom-up so each directory is already empty when it is removed
-     for root,dirs,files in dir.walk(fullpath,true) do
-         for i,f in ipairs(files) do
-             os.remove(path.join(root,f))
-         end
-         lfs.rmdir(root)
-     end
- end
-
-
-`dir.clonetree` clones directory trees. The first argument is the path of the tree to be cloned, which must exist, and the second is the path of the clone. (Note that the second path cannot be _inside_ the first path, since this leads to madness.) By default, it will then just recreate the directory structure. You can in addition provide a function, which will be applied to all files found.
-
- -- make a copy of my libs folder
- require 'pl'
- p1 = [[d:\dev\lua\libs]]
- p2 = [[D:\dev\lua\libs\..\tests]]
- dir.clonetree(p1,p2,dir.copyfile)
-
-A more sophisticated version, which only copies files which have been modified:
-
- -- p1 and p2 as before, or from arg[1] and arg[2]
- dir.clonetree(p1,p2,function(f1,f2)
- local res
- local t1,t2 = path.getmtime(f1),path.getmtime(f2)
- -- f2 might not exist, so be careful about t2
- if not t2 or t1 > t2 then
- res = dir.copyfile(f1,f2)
- end
- return res -- indicates successful operation
- end)
-
-`dir.clonetree` uses `path.common_prefix`. With `p1` and `p2` defined above, the common path is 'd:\dev\lua'. So 'd:\dev\lua\libs\testfunc.lua' is copied to 'd:\dev\lua\tests\testfunc.lua', etc.
-
-If you need to find the common path of a list of files, then `tablex.reduce` will do the job:
-
- > p3 = [[d:\dev]]
- > = tablex.reduce(path.common_prefix,{p1,p2,p3})
- 'd:\dev'
-
-## Date and Time
-
-<a id="date"></a>
-
-### Manipulating Dates
-
-The `Date` class provides a simplified way to work with [date and time](http://www.lua.org/pil/22.1.html) in Lua; it leans heavily on the functions `os.date` and `os.time`.
-
-A `Date` object can be constructed from a table, just like with `os.time`. Methods are provided to get and set the various parts of the date.
-
- > d = Date {year = 2011, month = 3, day = 2 }
- > = d
- 2011-03-02 12:00:00
- > = d:month(),d:year(),d:day()
- 3 2011 2
- > d:month(4)
- > = d
- 2011-04-02 12:00:00
- > d:add {day=1}
- > = d
- 2011-04-03 12:00:00
-
-`add` takes a table containing one of the date table fields.
-
- > = d:weekday_name()
- Sun
- > = d:last_day()
- 2011-04-30 12:00:00
- > = d:month_name(true)
- April
-
-There is a default conversion to text for date objects, but `Date.Format` gives you full control of the format for both parsing and displaying dates:
-
- > iso = Date.Format 'yyyy-mm-dd'
- > d = iso:parse '2010-04-10'
- > amer = Date.Format 'mm/dd/yyyy'
- > = amer:tostring(d)
- 04/10/2010
-
-With the 0.9.7 release, the `Date` constructor has become more flexible. You may omit any of the 'year', 'month' or 'day' fields:
-
- > = Date { year = 2008 }
- 2008-01-01 12:00:00
- > = Date { month = 3 }
- 2011-03-01 12:00:00
- > = Date { day = 20 }
- 2011-10-20 12:00:00
- > = Date { hour = 14, min = 30 }
- 2011-10-13 14:30:00
-
-If 'year' is omitted, then the current year is assumed, and likewise for 'month'.
-
-To set the time on such a partial date, you can use the fact that the 'setter' methods return the date object and so you can 'chain' these methods.
-
- > d = Date { day = 03 }
- > = d:hour(18):min(30)
- 2011-10-03 18:30:00
-
-Finally, `Date` also now accepts positional arguments:
-
- > = Date(2011,10,3)
- 2011-10-03 12:00:00
- > = Date(2011,10,3,18,30,23)
- 2011-10-03 18:30:23
-
-`Date.Format` has been extended. If you construct an instance without a pattern, then it will try to match against a set of known formats. This is useful for human-input dates since keeping to a strict format is not one of the strong points of users. It assumes that there will be a date, and then a time.
-
- > df = Date.Format()
- > = df:parse '5.30pm'
- 2011-10-13 17:30:00
- > = df:parse '1730'
- nil day out of range: 1730 is not between 1 and 31
- > = df:parse '17.30'
- 2011-10-13 17:30:00
- > = df:parse 'mar'
- 2011-03-01 12:00:00
- > = df:parse '3 March'
- 2011-03-03 12:00:00
- > = df:parse '15 March'
- 2011-03-15 12:00:00
- > = df:parse '15 March 2008'
- 2008-03-15 12:00:00
- > = df:parse '15 March 2008 1.30pm'
- 2008-03-15 13:30:00
- > = df:parse '2008-10-03 15:30:23'
- 2008-10-03 15:30:23
-
-ISO date format is of course a good idea if you need to deal with users from different countries. Here is the default behaviour for 'short' dates:
-
- > = df:parse '24/02/12'
- 2012-02-24 12:00:00
-
-That's not what Americans expect! It's tricky to work out in a cross-platform way exactly what the expected format is, so there is an explicit flag:
-
- > df:US_order(true)
- > = df:parse '9/11/01'
- 2001-09-11 12:00:00
-
-(@see Date)
-
-## Data
-
-### Reading Data Files
-
-The first thing to consider is this: do you actually need to write a custom file reader? And if the answer is yes, the next question is: can you write the reader in as clear a way as possible? Correctness, Robustness, and Speed; pick the first two and the third can be sorted out later, _if necessary_.
-
-A common sort of data file is the configuration file format commonly used on Unix systems. This format is often called a _property_ file in the Java world.
-
- # Read timeout in seconds
- read.timeout=10
-
- # Write timeout in seconds
- write.timeout=10
-
-Here is a simple Lua implementation:
-
- -- property file parsing with Lua string patterns
- props = {}
- for line in io.lines() do
-     -- skip comment lines (starting with '#') and blank lines
-     if line:find('#',1,true) ~= 1 and not line:find('^%s*$') then
-         local var,value = line:match('([^=]+)=(.*)')
-         props[var] = value
-     end
- end
-
-Very compact, but it suffers from the same disease as equivalent Perl programs; it uses odd string patterns which are 'lexically noisy'. Noisy code like this slows the casual reader down. (For an even more direct way of doing this, see the next section, 'Reading Configuration Files'.)
-
-Another implementation, using the Penlight libraries:
-
- -- property file parsing with extended string functions
- require 'pl'
- stringx.import()
- props = {}
- for line in io.lines() do
-     if not line:startswith('#') and not line:isspace() then
-         local var,value = line:splitv('=')
-         props[var] = value
-     end
- end
-
-This is more self-documenting; it is generally better to make the code express the _intention_, rather than having to scatter comments everywhere - comments are necessary, of course, but mostly to give the higher view of your intention that cannot be expressed in code. It is slightly slower, true, but in practice the speed of this script is determined by I/O, so further optimization is unnecessary.
-
-### Reading Unstructured Text Data
-
-<a id="input"/>
-
-Text data is sometimes unstructured, for example a file containing words. The `pl.input` module has a number of functions which make processing such files easier. For example, a script to count the number of words in standard input (@see input.words):
-
- -- countwords.lua
- require 'pl'
- local k = 0
- for w in input.words(io.stdin) do
- k = k + 1
- end
- print('count',k)
-
-Or this script to calculate the average of a set of numbers (@see input.numbers):
-
- -- average.lua
- require 'pl'
- local k = 0
- local sum = 0
- for n in input.numbers(io.stdin) do
- sum = sum + n
- k = k + 1
- end
- print('average',sum/k)
-
-These scripts can be improved further by _eliminating loops_. In the last case, there is a perfectly good function `seq.sum` which can already take a sequence of numbers and return both their sum and their count for us:
-
- -- average2.lua
- require 'pl'
- local total,n = seq.sum(input.numbers())
- print('average',total/n)
-
-A further simplification here is that if `numbers` or `words` are not passed an argument, they will grab their input from standard input. The first script can be rewritten:
-
- -- countwords2.lua
- require 'pl'
- print('count',seq.count(input.words()))
-
-A useful feature of a sequence generator like `numbers` is that it can read from a string source. Here is a script to calculate the sums of the numbers on each line in a file:
-
- -- sums.lua
- require 'pl'
- for line in io.lines() do
-     print(seq.sum(input.numbers(line)))
- end
-
-### Reading Columnar Data
-
-It is very common to find data in columnar form, either space or comma-separated, perhaps with an initial set of column headers. Here is a typical example:
-
- EventID Magnitude LocationX LocationY LocationZ
- 981124001 2.0 18988.4 10047.1 4149.7
- 981125001 0.8 19104.0 9970.4 5088.7
- 981127003 0.5 19012.5 9946.9 3831.2
- ...
-
-`input.fields` is designed to extract several columns, given some delimiter (default to whitespace). Here is a script to calculate the average X location of all the events:
-
- -- avg-x.lua
- require 'pl'
- io.read() -- skip the header line
- local sum,count = seq.sum(input.fields {3})
- print(sum/count)
-
-`input.fields` is passed either a field count, or a list of column indices, starting at one as usual. So in this case we're only interested in column 3. If you pass it a field count, then you get every field up to that count:
-
- for id,mag,locX,locY,locZ in input.fields (5) do
- ....
- end
-
-`input.fields` by default tries to convert each field to a number. It will skip lines which clearly don't match the pattern, but will abort the script if there are any fields which cannot be converted to numbers.
-
-The second parameter is a delimiter, by default spaces. ' ' is understood to mean 'any number of spaces', i.e. '%s+'. Any Lua string pattern can be used.
-
-The third parameter is a _data source_, by default standard input (@see input.create_getter). It assumes that the data source has a `read` method which brings in the next line, i.e. it is a 'file-like' object. As a special case, a string will be split into its lines:
-
- > for x,y in input.fields(2,' ','10 20\n30 40\n') do print(x,y) end
- 10 20
- 30 40
-
-Note the default behaviour for bad fields, which is to show the offending line number:
-
- > for x,y in input.fields(2,' ','10 20\n30 40x\n') do print(x,y) end
- 10 20
- line 2: cannot convert '40x' to number
-
-This behaviour of `input.fields` is appropriate for a script which you want to fail immediately with an appropriate _user_ error message if conversion fails. The fourth optional parameter is an options table: `{no_fail=true}` means that conversion is attempted but if it fails it just returns the string, rather as AWK would operate. You are then responsible for checking the type of the returned field. `{no_convert=true}` switches off conversion altogether and all fields are returned as strings.
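-
-A sketch of the two options applied to the string source used above (assuming Penlight is installed); with either option, the type of each field has to be checked by the caller:
-
- local input = require 'pl.input'
- local src = '10 20\n30 40x\n'
- -- no_fail: fields that cannot be converted come back as strings
- for x,y in input.fields(2,' ',src,{no_fail=true}) do
-     print(x,type(x),y,type(y))
- end
- -- no_convert: every field is returned as a string
- for x,y in input.fields(2,' ',src,{no_convert=true}) do
-     print(x,type(x),y,type(y))
- end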
-
-<a id="data"/>
-
-Sometimes it is useful to bring a whole dataset into memory, for operations such as extracting columns. Penlight provides a flexible reader specifically for reading this kind of data (@see data.read). Given a file looking like this:
-
- x,y
- 10,20
- 2,5
- 40,50
-
-Then `data.read` will create a table like this, with each row represented by a sublist:
-
- > t = data.read 'test.txt'
- > pretty.dump(t)
- {{10,20},{2,5},{40,50},fieldnames={'x','y'},delim=','}
-
-You can now analyze this returned table using the supplied methods. For instance, the method `column_by_name` returns a table of all the values of that column.
-
- -- testdata.lua
- require 'pl'
- d = data.read('fev.txt')
- for _,name in ipairs(d.fieldnames) do
- local col = d:column_by_name(name)
- if type(col[1]) == 'number' then
- local total,n = seq.sum(col)
- utils.printf("Average for %s is %f\n",name,total/n)
- end
- end
-
-`data.read` tries to be clever when given data; by default it expects a first line of column names, unless any of them are numbers. It tries to deduce the column delimiter by looking at the first line. Sometimes it guesses wrong; these things can be specified explicitly. The second optional parameter is an options table, which can override `delim` (a string pattern) and `fieldnames` (a list or comma-separated string), and specify `no_convert` (default is to convert), `numfields` (indices of columns known to be numbers, as a list) and `thousands_dot` (when the thousands separator in Excel CSV is '.').
-
-A very powerful feature is a way to execute SQL-like queries on such data:
-
- -- queries on tabular data
- require 'pl'
- local d = data.read('xyz.txt')
- local q = d:select('x,y,z where x > 3 and z < 2 sort by y')
- for x,y,z in q do
- print(x,y,z)
- end
-
-Please note that the format of queries is restricted to the following syntax:
-
- FIELDLIST [ 'where' CONDITION ] [ 'sort by' FIELD [asc|desc]]
-
-Any valid Lua code can appear in `CONDITION`; remember it is _not_ SQL and you have to use `==` (this warning comes from experience.)
-
-For this to work, _field names must be Lua identifiers_. So `read` will massage fieldnames so that all non-alphanumeric chars are replaced with underscores.
-
-`read` can handle standard CSV files fine, although it doesn't try to be a full-blown CSV parser. Spreadsheet programs are not always the best tool to process such data, strange as this might seem to some people. This is a toy CSV file; to appreciate the problem, imagine thousands of rows and dozens of columns like this:
-
- Department Name,Employee ID,Project,Hours Booked
- sales,1231,overhead,4
- sales,1255,overhead,3
- engineering,1501,development,5
- engineering,1501,maintenance,3
- engineering,1433,maintenance,10
-
-The task is to reduce the dataset to a relevant set of rows and columns, perhaps do some processing on row data, and write the result out to a new CSV file. The `write_row` method uses the delimiter to write the row to a file; `select_row` is like `select`, except it iterates over _rows_, not fields; this is necessary if we are dealing with a lot of columns!
-
- names = {[1501]='don',[1433]='dilbert'}
- t:write_row (outf,{'Employee','Hours_Booked'})
- q = t:select_row {
- fields=keepcols,
- where=function(row) return row[1]=='engineering' end
- }
- for row in q do
- row[1] = names[row[1]]
- t:write_row(outf,row)
- end
-
-`select_row` and `select` can be passed a table specifying the query; a list of field names, a function defining the condition and an optional parameter `sort_by`. It isn't really necessary here, but if we had a more complicated row condition (such as belonging to a specified set) then it is not generally possible to express such a condition as a query string, without resorting to hackery such as global variables.
-
-Data does not have to come from files, nor does it necessarily come from the lab or the accounts department. On Linux, `ps aux` gives you a full listing of all processes running on your machine. It is straightforward to feed the output of this command into `data.read` and perform useful queries on it. Notice that non-identifier characters like '%' get converted into underscores:
-
- require 'pl'
- List = require 'pl.List'
- f = io.popen 'ps aux'
- s = data.read (f,{last_field_collect=true})
- f:close()
- print(s.fieldnames)
- print(s:column_by_name 'USER')
- qs = 'COMMAND,_MEM where _MEM > 5 and USER=="steve"'
- for name,mem in s:select(qs) do
- print(mem,name)
- end
-
-
-I've always been an admirer of the AWK programming language; with `filter` (@see data.filter) you can get Lua programs which are just as compact:
-
- -- printxy.lua
- require 'pl'
- data.filter 'x,y where x > 3'
-
-It is common enough to have data files without headers of field names. `data.read` makes a special exception for such files if all fields are numeric. Since there are no column names to use in query expressions, you can use AWK-like column indexes, e.g. '$1,$2 where $1 > 3'. I have a little executable script on my system called `lf` which looks like this:
-
- #!/usr/bin/env lua
- require 'pl.data'.filter(arg[1])
-
-And it can be used generally as a filter command to extract columns from data. (The column specifications may be expressions or even constants.)
-
- $ lf '$1,$5/10' < test.dat
-
-(As with AWK, please note the single-quotes used in this command; this prevents the shell trying to expand the column indexes. If you are on Windows, then you are fine, but it is still necessary to quote the expression in double-quotes so it is passed as one argument to your batch file.)
-
-As a tutorial resource, have a look at test-data.lua in the PL tests directory for other examples of use, plus comments.
-
-Finally, for the curious, the global variable `_DEBUG` can be used to print out the actual iterator function which a query generates and dynamically compiles. By using code generation, we can get pretty much optimal performance out of arbitrary queries.
-
- > lua -lpl -e "_DEBUG=true" -e "data.filter 'x,y where x > 4 sort by x'" < test.txt
- return function (t)
- local i = 0
- local v
- local ls = {}
- for i,v in ipairs(t) do
- if v[1] > 4 then
- ls[#ls+1] = v
- end
- end
- table.sort(ls,function(v1,v2)
- return v1[1] < v2[1]
- end)
- local n = #ls
- return function()
- i = i + 1
- v = ls[i]
- if i > n then return end
- return v[1],v[2]
- end
- end
-
- 10,20
- 40,50
-
-<a id="config"/>
-
-### Reading Configuration Files
-
-The `config` module provides a simple way to convert several kinds of configuration files into a Lua table. Consider the simple example:
-
- # test.config
- # Read timeout in seconds
- read.timeout=10
-
- # Write timeout in seconds
- write.timeout=5
-
- #acceptable ports
- ports = 1002,1003,1004
-
-This can be easily brought in using `config.read` and the result shown using `pl.pretty.write` (@see pretty.write)
-
- -- readconfig.lua
- local config = require 'pl.config'
- local pretty= require 'pl.pretty'
-
- local t = config.read(arg[1])
- print(pretty.write(t))
-
-and the output of `lua readconfig.lua test.config` is:
-
- {
- ports = {
- 1002,
- 1003,
- 1004
- },
- write_timeout = 5,
- read_timeout = 10
- }
-
-That is, `config.read()` will bring in all key/value pairs, ignore # comments, and ensure that the key names are proper Lua identifiers by replacing non-identifier characters with '_'. If the values are numbers, then they will be converted. (So the value of `t.write_timeout` is the number 5). In addition, any values which are separated by commas will be converted likewise into an array.
-
-Any line can be continued with a backslash. So this will all be considered one line:
-
- names=one,two,three, \
- four,five,six,seven, \
- eight,nine,ten
-
-
-Windows-style INI files are also supported. The section structure of INI files translates naturally to nested tables in Lua:
-
- ; test.ini
- [timeouts]
- read=10 ; Read timeout in seconds
- write=5 ; Write timeout in seconds
- [portinfo]
- ports = 1002,1003,1004
-
-The output is:
-
- {
- portinfo = {
- ports = {
- 1002,
- 1003,
- 1004
- }
- },
- timeouts = {
- write = 5,
- read = 10
- }
- }
-
-You can now refer to the write timeout as `t.timeouts.write`.
-
-As a final example of the flexibility of `config.read`, if passed this simple comma-delimited file
-
- one,two,three
- 10,20,30
- 40,50,60
- 1,2,3
-
-it will produce the following table:
-
- {
- { "one", "two", "three" },
- { 10, 20, 30 },
- { 40, 50, 60 },
- { 1, 2, 3 }
- }
-
-`config.read` isn't designed to read all CSV files in general, but intended to support some Unix configuration files not structured as key-value pairs, such as '/etc/passwd'.
-
-This function is intended to be a Swiss Army Knife of configuration readers, but it does have to make assumptions, and you may not like them. So there is an optional extra parameter which allows some control; it is a table that may have the following fields:
-
- {
- variablilize = true,
- convert_numbers = true,
- trim_space = true,
- list_delim = ','
- }
-
-`variablilize` is the option that converted `write.timeout` in the first example to the valid Lua identifier `write_timeout`. If `convert_numbers` is true, then an attempt is made to convert any string that starts like a number. `trim_space` ensures that there is no starting or trailing whitespace with values, and `list_delim` is the character that will be used to decide whether to split a value up into a list (it may be a Lua string pattern such as '%s+'.)
-
-For instance, the password file in Unix is colon-delimited:
-
- t = config.read('/etc/passwd',{list_delim=':'})
-
-This produces the following output on my system (only the last two entries shown):
-
- {
- ...
- {
- "user",
- "x",
- "1000",
- "1000",
- "user,,,",
- "/home/user",
- "/bin/bash"
- },
- {
- "sdonovan",
- "x",
- "1001",
- "1001",
- "steve donovan,28,,",
- "/home/sdonovan",
- "/bin/bash"
- }
- }
-
-You can get this into a more sensible format, where the usernames are the keys, with:
-
- t = tablex.pairmap(function(k,v) return v,v[1] end,t)
-
-and you get:
-
- { ...
- sdonovan = {
- "sdonovan",
- "x",
- "1001",
- "1001",
- "steve donovan,28,,",
- "/home/sdonovan",
- "/bin/bash"
- }
- ...
- }
-
-
-<a id="lexer"/>
-
-### Lexical Scanning
-
-Although Lua's string pattern matching is very powerful, there are times when something more powerful is needed. `pl.lexer.scan` provides a lexical scanner which _tokenizes_ a string, classifying tokens into numbers, strings, etc.
-
- > lua -lpl
- Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
- > tok = lexer.scan 'alpha = sin(1.5)'
- > = tok()
- iden alpha
- > = tok()
- = =
- > = tok()
- iden sin
- > = tok()
- ( (
- > = tok()
- number 1.5
- > = tok()
- ) )
- > = tok()
- (nil)
-
-The scanner is a function, which is repeatedly called and returns the _type_ and _value_ of the token. Recognized basic types are 'iden', 'string', 'number', and 'space'; everything else is represented by itself. Note that by default the scanner will skip any 'space' tokens.
-
-'comment' and 'keyword' aren't applicable to the plain scanner, which is not language-specific, but a scanner which understands Lua is available. It recognizes the Lua keywords, and understands both short and long comments and strings.
-
- > for t,v in lexer.lua 'for i=1,n do' do print(t,v) end
- keyword for
- iden i
- = =
- number 1
- , ,
- iden n
- keyword do
-
-A lexical scanner is useful where you have highly-structured data which is not nicely delimited by newlines. For example, here is a snippet of an in-house file format which it was my task to maintain:
-
- points (818344.1,-20389.7,-0.1),(818337.9,-20389.3,-0.1),(818332.5,-20387.8,-0.1)
- ,(818327.4,-20388,-0.1),(818322,-20387.7,-0.1),(818316.3,-20388.6,-0.1)
- ,(818309.7,-20389.4,-0.1),(818303.5,-20390.6,-0.1),(818295.8,-20388.3,-0.1)
- ,(818290.5,-20386.9,-0.1),(818285.2,-20386.1,-0.1),(818279.3,-20383.6,-0.1)
- ,(818274,-20381.2,-0.1),(818274,-20380.7,-0.1);
-
-Here is code to extract the points using `pl.lexer`:
-
- -- assume 's' contains the text above...
- local lexer = require 'pl.lexer'
- local expecting = lexer.expecting
- local append = table.insert
-
- local tok = lexer.scan(s)
-
- local points = {}
- local t,v = tok() -- should be 'iden','points'
-
- while t ~= ';' do
- local c = {}
- expecting(tok,'(')
- c.x = expecting(tok,'number')
- expecting(tok,',')
- c.y = expecting(tok,'number')
- expecting(tok,',')
- c.z = expecting(tok,'number')
- expecting(tok,')')
- t,v = tok() -- either ',' or ';'
- append(points,c)
- end
-
-The `expecting` function grabs the next token and if the type doesn't match, it throws an error. (`pl.lexer`, unlike other PL libraries, raises errors if something goes wrong, so you should wrap your code in `pcall` to catch the error gracefully.)
-
-The scanners all have a second optional argument, which is a table which controls whether you want to exclude spaces and/or comments. The default for `lexer.lua` is `{space=true,comments=true}`. There is a third optional argument which determines how string and number tokens are to be processed.
-
-The ultimate highly-structured data is, of course, program source. Here is a snippet from 'text-lexer.lua':
-
- require 'pl'
-
- lines = [[
- for k,v in pairs(t) do
- if type(k) == 'number' then
- print(v) -- array-like case
- else
- print(k,v)
- end
- end
- ]]
-
- ls = List()
- for tp,val in lexer.lua(lines,{space=true,comments=true}) do
- assert(tp ~= 'space' and tp ~= 'comment')
- if tp == 'keyword' then ls:append(val) end
- end
- test.asserteq(ls,List{'for','in','do','if','then','else','end','end'})
-
-Here is a useful little utility that identifies all common global variables found in a Lua module:
-
- -- testglobal.lua
- require 'pl'
-
- local txt,err = utils.readfile(arg[1])
- if not txt then return print(err) end
-
- local globals = List()
- for t,v in lexer.lua(txt) do
- if t == 'iden' and _G[v] then
- globals:append(v)
- end
- end
- pretty.dump(seq.count_map(globals))
-
-Rather than dumping the whole list, with its duplicates, we pass it through `seq.count_map` which turns the list into a table where the keys are the values, and the associated values are the number of times those values occur in the sequence. Typical output looks like this:
-
- {
- type = 2,
- pairs = 2,
- table = 2,
- print = 3,
- tostring = 2,
- require = 1,
- ipairs = 4
- }
-
-You could further pass this through `tablex.keys` to get a unique list of symbols. This can be useful when writing 'strict' Lua modules, where all global symbols must be defined as locals at the top of the file.
-
-For a more detailed use of `lexer.scan`, please look at 'testxml.lua' in the examples directory.
-
-### XML
-
-New in the 0.9.7 release is some support for XML. This is a large topic, and Penlight does not provide a full XML stack, which is properly the task of a more specialized library.
-
-#### Parsing and Pretty-Printing
-
-The semi-standard XML parser in the Lua universe is [lua-expat](). In particular, it has a function called `lxp.lom.parse` which will parse XML into the Lua Object Model (LOM) format. However, it does not provide a way to convert this data back into XML text. `xml.parse` will use this function, _if_ `lua-expat` is available, and otherwise falls back to a pure Lua parser originally written by Roberto Ierusalimschy.
-
-The resulting document object knows how to render itself as a string, which is useful for debugging:
-
- > d = xml.parse "<nodes><node id='1'>alice</node></nodes>"
- > = d
- <nodes><node id='1'>alice</node></nodes>
- > pretty.dump (d)
- {
- {
- "alice",
- attr = {
- "id",
- id = "1"
- },
- tag = "node"
- },
- attr = {
- },
- tag = "nodes"
- }
-
-Looking at the actual shape of the data reveals the structure of LOM:
-
- * every element has a `tag` field with its name
- * plus an `attr` field which is a table containing the attributes as fields, and also as an array. It is always present.
- * the children of the element are the array part of the element, so `d[1]` is the first child of `d`, etc.
-
-It could be argued that having attributes also as the array part of `attr` is not essential (you generally cannot depend on attribute order in XML) but that's how it goes with this standard.
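-
-Because LOM is just nested tables, it can be explored without any XML library at all. The following is a small sketch using a hand-written table in the shape described above; the variable names are only for illustration and no parser is involved.
-
-```lua
--- A hand-written document in LOM shape: 'tag', 'attr', and children in the array part.
-local d = {
-  tag = 'nodes', attr = {},
-  { tag = 'node', attr = { 'id', id = '1' }, 'alice' },
-  { tag = 'node', attr = { 'id', id = '2' }, 'john' },
-}
-
--- Collect the text of each child element, keyed by its 'id' attribute.
-local names = {}
-for _, child in ipairs(d) do
-  if type(child) == 'table' then   -- skip plain text children, which are strings
-    names[child.attr.id] = child[1]
-  end
-end
-
-print(names['1'], names['2'])
-```
-
-The `type(child) == 'table'` test matters as soon as a document mixes text and elements, since text children are stored as plain strings in the same array.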
-
-`lua-expat` is another _soft dependency_ of Penlight; generally, the fallback parser is good enough for straightforward XML as is commonly found in configuration files, etc. `doc.basic_parse` is not intended to be a proper conforming parser (it's only sixty lines) but it handles simple kinds of documents that do not have comments or DTD directives. It is intelligent enough to ignore the `<?xml` directive and that is about it.
-
-You can get pretty-printing by explicitly calling `xml.tostring` and passing it the initial indent and the per-element indent:
-
- > = xml.tostring(d,'',' ')
-
- <nodes>
- <node id='1'>alice</node>
- </nodes>
-
-There is a fourth argument which is the _attribute indent_:
-
- > a = xml.parse "<frodo name='baggins' age='50' type='hobbit'/>"
- > = xml.tostring(a,'',' ',' ')
-
- <frodo
- type='hobbit'
- name='baggins'
- age='50'
- />
-
-#### Parsing and Working with Configuration Files
-
-It's common to find configurations expressed with XML these days. It's straightforward to 'walk' the LOM data and extract the data in the form you want:
-
- require 'pl'
-
- local config = [[
- <config>
- <alpha>1.3</alpha>
- <beta>10</beta>
- <name>bozo</name>
- </config>
- ]]
- local d,err = xml.parse(config)
-
- local t = {}
- for item in d:childtags() do
- t[item.tag] = item[1]
- end
-
- pretty.dump(t)
- --->
- {
- beta = "10",
- alpha = "1.3",
- name = "bozo"
- }
-
-The only gotcha is that here we must use the `childtags` method, which will skip over any text elements.
-
-A more involved example is this excerpt from `serviceproviders.xml`, which is usually found at `/usr/share/mobile-broadband-provider-info/serviceproviders.xml` on Debian/Ubuntu Linux systems.
-
- d = xml.parse [[
- <serviceproviders format="2.0">
- <country code="za">
- <provider>
- <name>Cell-c</name>
- <gsm>
- <network-id mcc="655" mnc="07"/>
- <apn value="internet">
- <username>Cellcis</username>
- <dns>196.7.0.138</dns>
- <dns>196.7.142.132</dns>
- </apn>
- </gsm>
- </provider>
- <provider>
- <name>MTN</name>
- <gsm>
- <network-id mcc="655" mnc="10"/>
- <apn value="internet">
- <dns>196.11.240.241</dns>
- <dns>209.212.97.1</dns>
- </apn>
- </gsm>
- </provider>
- <provider>
- <name>Vodacom</name>
- <gsm>
- <network-id mcc="655" mnc="01"/>
- <apn value="internet">
- <dns>196.207.40.165</dns>
- <dns>196.43.46.190</dns>
- </apn>
- <apn value="unrestricted">
- <name>Unrestricted</name>
- <dns>196.207.32.69</dns>
- <dns>196.43.45.190</dns>
- </apn>
- </gsm>
- </provider>
- <provider>
- <name>Virgin Mobile</name>
- <gsm>
- <apn value="vdata">
- <dns>196.7.0.138</dns>
- <dns>196.7.142.132</dns>
- </apn>
- </gsm>
- </provider>
- </country>
-
- </serviceproviders>
- ]]
-
-Getting the names of the providers per-country is straightforward:
-
- local t = {}
- for country in d:childtags() do
- local providers = {}
- t[country.tag] = providers
- for provider in country:childtags() do
- table.insert(providers,provider:child_with_name('name'):get_text())
- end
- end
-
- pretty.dump(t)
- -->
- {
- country = {
- "Cell-c",
- "MTN",
- "Vodacom",
- "Virgin Mobile"
- }
- }
-
-#### Generating XML with 'xmlification'
-
-This feature is inspired by the `htmlify` function used by [Orbit](http://keplerproject.github.com/orbit/) to simplify HTML generation, except that no function environment magic is used; the `tags` function returns a set of _constructors_ for elements of the given tag names.
-
- > nodes, node = xml.tags 'nodes, node'
- > = node 'alice'
- <node>alice</node>
- > = nodes { node {id='1','alice'}}
- <nodes><node id='1'>alice</node></nodes>
-
-The flexibility of Lua tables is very useful here, since both the attributes and the children of an element can be encoded naturally. The argument to these tag constructors is either a single value (like a string) or a table where the attributes are the named keys and the children are the array values.
-
-#### Generating XML using Templates
-
-A template is a little XML document which contains dollar-variables. The `subst` method on a document is fed an array of tables containing values for these variables. Note how the parent tag name is specified:
-
- > templ = xml.parse "<node id='$id'>$name</node>"
- > = templ:subst {tag='nodes', {id=1,name='alice'},{id=2,name='john'}}
- <nodes><node id='1'>alice</node><node id='2'>john</node></nodes>
-
-#### Extracting Data using Templates
-
-Matching goes in the opposite direction. We have a document, and would like to extract values from it using a pattern.
-
-A common use of this is parsing the XML result of API queries. The [(undocumented) Google Weather API](http://blog.programmableweb.com/2010/02/08/googles-secret-weather-api/) is a good example. Grabbing the result of `http://www.google.com/ig/api?weather=Johannesburg,ZA` we get something like this, after pretty-printing:
-
- <xml_api_reply version='1'>
- <weather module_id='0' tab_id='0' mobile_zipped='1' section='0' row='0' mobile_row='0'>
- <forecast_information>
- <city data='Johannesburg, Gauteng'/>
- <postal_code data='Johannesburg,ZA'/>
- <latitude_e6 data=''/>
- <longitude_e6 data=''/>
- <forecast_date data='2010-10-02'/>
- <current_date_time data='2010-10-02 18:30:00 +0000'/>
- <unit_system data='US'/>
- </forecast_information>
- <current_conditions>
- <condition data='Clear'/>
- <temp_f data='75'/>
- <temp_c data='24'/>
- <humidity data='Humidity: 19%'/>
- <icon data='/ig/images/weather/sunny.gif'/>
- <wind_condition data='Wind: NW at 7 mph'/>
- </current_conditions>
- <forecast_conditions>
- <day_of_week data='Sat'/>
- <low data='60'/>
- <high data='89'/>
- <icon data='/ig/images/weather/sunny.gif'/>
- <condition data='Clear'/>
- </forecast_conditions>
- ....
- </weather>
- </xml_api_reply>
-
-Assume that the above XML has been read into `google`. The idea is to write a pattern looking like a template, and use it to extract some values of interest:
-
- t = [[
- <weather>
- <current_conditions>
- <condition data='$condition'/>
- <temp_c data='$temp'/>
- </current_conditions>
- </weather>
- ]]
-
- local res, ret = google:match(t)
- pretty.dump(res)
-
-And the output is:
-
- {
- condition = "Clear",
- temp = "24"
- }
-
-The `match` method can be passed a LOM document or some text, which will be parsed first. Note that `$NUMBER` is treated specially as a numerical index, so that `$1` is the first element of the resulting array, etc.
-
-
-## Functional Programming
-
-### Sequences
-
-A Lua iterator (in its simplest form) is a function which can be repeatedly called to return a set of one or more values. The `for in` statement understands these iterators, and loops until the function returns `nil`. There are standard sequence adapters for tables in Lua (`ipairs` and `pairs`), and `io.lines` returns an iterator over all the lines in a file. In the Penlight libraries, such iterators are also called _sequences_. A sequence of single values (say from `io.lines`) is called _single-valued_, whereas the sequence defined by `pairs` is _double-valued_.
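-
-As a rough illustration of this definition, here is a single-valued sequence written by hand in plain Lua. The names `range` and `collect` are hypothetical stand-ins chosen for this sketch, not the `pl.seq` implementations.
-
-```lua
--- An inclusive range expressed as a closure: each call returns the next value.
-local function range(first, last)
-  local i = first - 1
-  return function()
-    i = i + 1
-    if i <= last then return i end   -- returning nothing (nil) ends the loop
-  end
-end
-
--- Collect any single-valued sequence into a table.
-local function collect(iter)
-  local t = {}
-  for v in iter do t[#t + 1] = v end
-  return t
-end
-
-local t = collect(range(1, 4))
-print(table.concat(t, ','))
-```
-
-Note that such a sequence carries its own state, so it can only be traversed once; call `range` again to start over.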
-
-`pl.seq` provides a number of useful iterators, and some functions which operate on sequences. At first sight this example looks like an attempt to write Python in Lua (with the sequence being inclusive):
-
- > for i in seq.range(1,4) do print(i) end
- 1
- 2
- 3
- 4
-
-But `range` is actually equivalent to Python's `xrange`, since it generates a sequence, not a list. To get a list, use `seq.copy(seq.range(1,10))`, which takes any single-value sequence and makes a table from the result. `seq.list` is like `ipairs` except that it does not give you the index, just the value.
-
- > for x in seq.list {1,2,3} do print(x) end
- 1
- 2
- 3
-
-`enum` takes a sequence and turns it into a double-valued sequence consisting of a sequence number and the value, so `enum(list(ls))` is actually equivalent to `ipairs`. A more interesting example prints out a file with line numbers:
-
- for i,v in seq.enum(io.lines(fname)) do print(i..' '..v) end
-
-Sequences can be _combined_, either by 'zipping' them or by concatenating them.
-
- > for x,y in seq.zip(l1,l2) do print(x,y) end
- 10 1
- 20 2
- 30 3
- > for x in seq.splice(l1,l2) do print(x) end
- 10
- 20
- 30
- 1
- 2
- 3
-
-`seq.printall` is useful for printing out single-valued sequences, and provides some finer control over formatting, such as a delimiter, the number of fields per line, and a format string to use (@see string.format)
-
- > seq.printall(seq.random(10))
- 0.0012512588885159 0.56358531449324 0.19330423902097 ....
- > seq.printall(seq.random(10), ',', 4, '%4.2f')
- 0.17,0.86,0.71,0.51
- 0.30,0.01,0.09,0.36
- 0.15,0.17,
-
-`map` will apply a function to a sequence.
-
- > seq.printall(seq.map(string.upper, {'one','two'}))
- ONE TWO
- > seq.printall(seq.map('+', {10,20,30}, 1))
- 11 21 31
-
-`filter` will filter a sequence using a boolean function (often called a _predicate_). For instance, this code only prints lines in a file which are composed of digits:
-
- for l in seq.filter(io.lines(file), stringx.isdigit) do print(l) end
-
-The following returns a table consisting of all the positive values in the original table (equivalent to `tablex.filter(ls, '>', 0)`)
-
- ls = seq.copy(seq.filter(ls, '>', 0))
-
-We've already encountered `seq.sum` when discussing `input.numbers`. This can also be expressed with `seq.reduce`:
-
- > seq.reduce(function(x,y) return x + y end, seq.list{1,2,3,4})
- 10
-
-`seq.reduce` applies a binary function in a recursive fashion, so that:
-
- reduce(op,{1,2,3}) => op(1,reduce(op,{2,3})) => op(1,op(2,3))
-
-It's now possible to easily generate other cumulative operations; the standard operations declared in `pl.operator` are useful here:
-
- > ops = require 'pl.operator'
- > -- can also say '*' instead of ops.mul
- > = seq.reduce(ops.mul,input.numbers '1 2 3 4')
- 24
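-
-The idea behind a reduction fits in a few lines of plain Lua. This is only a sketch with hypothetical helpers `reduce` and `values`, not the `pl.seq` code; it accumulates from left to right, which gives the same answer as the grouping shown earlier for associative operations such as `+` and `*`.
-
-```lua
--- Fold a single-valued sequence with a binary function.
-local function reduce(op, iter)
-  local acc = iter()               -- the first value seeds the accumulator
-  if acc == nil then return nil end
-  for v in iter do acc = op(acc, v) end
-  return acc
-end
-
--- Turn an array into a single-valued sequence.
-local function values(t)
-  local i = 0
-  return function() i = i + 1; return t[i] end
-end
-
-local sum = reduce(function(x, y) return x + y end, values {1, 2, 3, 4})
-local product = reduce(function(x, y) return x * y end, values {1, 2, 3, 4})
-print(sum, product)
-```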
-
-There are functions to extract statistics from a sequence of numbers:
-
- > l1 = List {10,20,30}
- > l2 = List {1,2,3}
- > = seq.minmax(l1)
- 10 30
- > = seq.sum(l1)
- 60 3
-
-It is common to get sequences where values are repeated, say the words in a file. `count_map` will take such a sequence and count the values, returning a table where the _keys_ are the unique values, and the value associated with each key is the number of times they occurred:
-
- > t = seq.count_map {'one','fred','two','one','two','two'}
- > = t
- {one=2,fred=1,two=3}
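-
-The counting itself is a classic Lua idiom. Here is a minimal sketch, assuming the input is a plain array; `count_values` is a hypothetical name for this illustration and is not how `pl.seq` is necessarily written.
-
-```lua
--- Count how often each value occurs in an array.
-local function count_values(t)
-  local counts = {}
-  for _, v in ipairs(t) do
-    counts[v] = (counts[v] or 0) + 1   -- a missing key counts as zero
-  end
-  return counts
-end
-
-local c = count_values {'one', 'fred', 'two', 'one', 'two', 'two'}
-print(c.one, c.fred, c.two)
-```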
-
-This will also work on numerical sequences, but you cannot expect the result to be a proper list, i.e. having no 'holes'. Instead, you always need to use `pairs` to iterate over the result - note that there is a hole at index 5:
-
- > t = seq.count_map {1,2,4,2,2,3,4,2,6}
- > for k,v in pairs(t) do print(k,v) end
- 1 1
- 2 4
- 3 1
- 4 2
- 6 1
-
-`unique` uses `count_map` to return a list of the unique values, that is, just the keys of the resulting table.
-
-`last` turns a single-valued sequence into a double-valued sequence with the current value and the last value:
-
- > for current,last in seq.last {10,20,30,40} do print (current,last) end
- 20 10
- 30 20
- 40 30
-
-This makes it easy to do things like identify repeated lines in a file, or construct differences between values. `filter` can handle double-valued sequences as well, so one could filter such a sequence to only return cases where the current value is less than the last value by using `operator.lt` or just '<'. This code then copies the resulting sequence into a table.
-
- > s = {10,9,10,3}
- > = seq.copy(seq.filter(seq.last(s),'<'))
- {9,3}
-
-
-### Sequence Wrappers
-
-The functions in `pl.seq` cover the common patterns when dealing with sequences, but chaining these functions together can lead to ugly code. Consider the last example of the previous section; `seq` is repeated three times and the resulting expression has to be read right-to-left. The first issue can be helped by local aliases, so that the expression becomes `copy(filter(last(s),'<'))` but the second issue refers to the somewhat unnatural order of functional application. We tend to prefer reading operations from left to right, which is one reason why object-oriented notation has become popular. Sequence adapters allow this expression to be written like so:
-
- seq(s):last():filter('<'):copy()
-
-With this notation, the operation becomes a chain of method calls running from left to right.
-
-'Sequence' is not a basic Lua type; sequences are generally functions or callable objects. The expression `seq(s)` wraps a sequence in a _sequence wrapper_, which is an object which understands all the functions in `pl.seq` as methods. This object then explicitly represents sequences.
-
-As a special case, the constructor (which is when you call the table `seq`) will make a wrapper for a plain list-like table. Here we apply the length operator to a sequence of strings, and print them out.
-
- > seq{'one','tw','t'} :map '#' :printall()
- 3 2 1
-
-As a convenience, there is a function `seq.lines` which behaves just like `io.lines` except it wraps the result as an explicit sequence type. This takes the first 10 lines from standard input, makes them uppercase, turns them into a sequence with a count and the value, glues these together with the concatenation operator, and finally prints out the sequence delimited by a newline.
-
- seq.lines():take(10):upper():enum():map('..'):printall '\n'
-
-Note the method `upper`, which is not a `seq` function. If an unknown method is called, sequence wrappers apply that method to all the values in the sequence (this is implicit use of `mapmethod` - @see seq.mapmethod)
-
-It is straightforward to create custom sequences that can be used in this way. On Unix, `/dev/random` gives you an _endless_ sequence of random bytes, so we use `take` to limit the sequence, and then `map` to scale the result into the desired range. The key step is to use `seq` to wrap the iterator function:
-
- -- random.lua
- local seq = require 'pl.seq'
-
- function dev_random()
- local f = io.open('/dev/random')
- local byte = string.byte
- return seq(function()
- -- read two bytes into a string and convert into a 16-bit number
- local s = f:read(2)
- return byte(s,1) + 256*byte(s,2)
- end)
- end
-
- -- print 10 random numbers from 0 to 1 !
- dev_random():take(10):map('%',100):map('/',100):printall ','
-
-
-Another Linux one-liner depends on the `/proc` filesystem and makes a list of all the currently running processes:
-
- pids = seq(lfs.dir '/proc'):filter(stringx.isdigit):map(tonumber):copy()
-
-This version of Penlight has an experimental feature which relies on the fact that _all_ Lua types can have metatables, including functions. This makes _implicit sequence wrapping_ possible:
-
- > seq.import()
- > seq.random(5):printall(',',5,'%4.1f')
- 0.0, 0.1, 0.4, 0.1, 0.2
-
-This avoids the awkward `seq(seq.random(5))` construction. Or the iterator can come from somewhere else completely:
-
- > ('one two three'):gmatch('%a+'):printall(',')
- one,two,three,
-
-After `seq.import()`, it is no longer necessary to explicitly wrap sequence functions.
-
-But there is a price to pay for this convenience. _Every_ function is affected, so that any function can be used, appropriate or not:
-
- > math.sin:printall()
- ..seq.lua:287: bad argument #1 to '(for generator)' (number expected, got nil)
- > a = tostring
- > = a:find(' ')
- function: 0042C920
-
-What function is returned? It's almost certain to be something that makes no sense in the current context. So implicit sequences may make certain kinds of programming mistakes harder to catch - they are best used for interactive exploration and small scripts.
-
-<a id="comprehensions"/>
-
-### List Comprehensions
-
-List comprehensions are a compact way to create tables by specifying their elements. In Python, you can say this:
-
- ls = [x for x in range(5)] # == [0,1,2,3,4]
-
-In Lua, using `pl.comprehension`:
-
- > C = require('pl.comprehension').new()
- > = C ('x for x=1,10') ()
- {1,2,3,4,5,6,7,8,9,10}
-
-`C` is a function which compiles a list comprehension _string_ into a _function_. In this case, the function has no arguments. The parentheses are redundant for a function taking a string argument, so this works as well:
-
- > = C 'x^2 for x=1,4' ()
- {1,4,9,16}
- > = C '{x,x^2} for x=1,4' ()
- {{1,1},{2,4},{3,9},{4,16}}
-
-Note that the expression can be _any_ function of the variable `x`!
-
-The basic syntax so far is `<expr> for <set>`, where `<set>` can be anything that the Lua `for` statement understands. `<set>` can also just be the variable, in which case the values will come from the _argument_ of the comprehension. Here I'm emphasizing that a comprehension is a function which can take a list argument:
-
- > = C '2*x for x' {1,2,3}
- {2,4,6}
- > dbl = C '2*x for x'
- > = dbl {10,20,30}
- {20,40,60}
-
-Here is a somewhat more explicit way of saying the same thing; `_1` is a _placeholder_ referring to the _first_ argument passed to the comprehension.
-
- > = C '2*x for _,x in pairs(_1)' {10,20,30}
- {20,40,60}
- > = C '_1(x) for x'(tostring,{1,2,3,4})
- {'1','2','3','4'}
-
-This extended syntax is useful when you wish to collect the result of some iterator, such as `io.lines`. This comprehension creates a function which creates a table of all the lines in a file:
-
- > f = io.open('array.lua')
- > lines = C 'line for line in _1:lines()' (f)
- > = #lines
- 118
-
-There are a number of functions that may be applied to the result of a comprehension:
-
- > = C 'min(x for x)' {1,44,0}
- 0
- > = C 'max(x for x)' {1,44,0}
- 44
- > = C 'sum(x for x)' {1,44,0}
- 45
-
-(These are equivalent to a reduce operation on a list.)
-
-After the `for` part, there may be a condition, which filters the output. This comprehension collects the even numbers from a list:
-
- > = C 'x for x if x % 2 == 0' {1,2,3,4,5}
- {2,4}
-
-There may be a number of `for` parts:
-
- > = C '{x,y} for x = 1,2 for y = 1,2' ()
- {{1,1},{1,2},{2,1},{2,2}}
- > = C '{x,y} for x for y' ({1,2},{10,20})
- {{1,10},{1,20},{2,10},{2,20}}
-
-These comprehensions are useful when dealing with functions of more than one variable, and are not so easily achieved with the other Penlight functional forms.
-
-<a id="func"/>
-
-### Creating Functions from Functions
-
-Lua functions may be treated like any other value, although of course you cannot multiply or add them. One operation that makes sense is _function composition_, which chains function calls (so `(f * g)(x)` is `f(g(x))`.)
-
- > func = require 'pl.func'
- > printf = func.compose(io.write,string.format)
- > printf("hello %s\n",'world')
- hello world
- true
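-
-Composition is easy to write by hand, which may help in seeing what is going on. This is a plain-Lua sketch with a hypothetical local `compose` and two toy functions, not the `pl.func` source.
-
-```lua
--- compose(f, g) returns a function that applies g first, then f.
-local function compose(f, g)
-  return function(...) return f(g(...)) end
-end
-
-local function inc(x) return x + 1 end
-local function double(x) return 2 * x end
-
-local f = compose(double, inc)   -- f(x) is double(inc(x))
-print(f(3))
-```
-
-The order matters: `compose(inc, double)` would double first and then add one.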
-
-Many functions require you to pass a function as an argument, say to apply to all values of a sequence or as a callback. Often useful functions have the wrong number of arguments. So there is a need to construct a function of one argument from one of two arguments, _binding_ the extra argument to a given value.
-
-_Currying_ takes a function of n arguments and returns a function of n-1 arguments where the first argument is bound to some value:
-
- > p2 = func.curry(print,'start>')
- > p2('hello',2)
- start> hello 2
- > ops = require 'pl.operator'
- > = tablex.filter({1,-2,10,-1,2},func.curry(ops.gt,0))
- {-2,-1}
- > = tablex.filter({1,-2,10,-1,2},func.curry(ops.le,0))
- {1,10,2}
-
-The last example unfortunately reads backwards, because `curry` always binds the first argument!
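-
-To see why the filter reads backwards, it helps to spell out what binding the first argument means. The following plain-Lua sketch uses hypothetical local definitions of `curry` and `gt`; it is meant to mirror the behaviour described here, not to reproduce `pl.func`.
-
-```lua
--- Bind the first argument of f to the value a.
-local function curry(f, a)
-  return function(...) return f(a, ...) end
-end
-
-local function gt(x, y) return x > y end
-
--- is_negative(x) is gt(0, x), i.e. '0 > x', so it selects the negative values.
-local is_negative = curry(gt, 0)
-
-local result = {}
-for _, v in ipairs {1, -2, 10, -1, 2} do
-  if is_negative(v) then result[#result + 1] = v end
-end
-print(table.concat(result, ','))
-```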
-
-Currying is a specialized form of function argument binding. Here is another way to say the `print` example:
-
- > p2 = func.bind(print,'start>',func._1,func._2)
- > p2('hello',2)
- start> hello 2
-
-where `_1` and `_2` are _placeholder variables_, corresponding to the first and second argument respectively.
-
-Having `func` all over the place is distracting, so it's useful to pull all of `pl.func` into the local context. Here is the filter example, this time the right way around:
-
- > utils.import 'pl.func'
- > = tablex.filter({1,-2,10,-1,2},bind(ops.gt, _1, 0))
- {1,10,2}
-
-
-`tablex.merge` does a general merge of two tables. This example shows the usefulness of binding the last argument of a function.
-
- > S1 = {john=27, jane=31, mary=24}
- > S2 = {jane=31, jones=50}
- > intersection = bind(tablex.merge, _1, _2, false)
- > union = bind(tablex.merge, _1, _2, true)
- > = intersection(S1,S2)
- {jane=31}
- > = union(S1,S2)
- {mary=24,jane=31,john=27,jones=50}
-
-When using `bind` to curry `print`, we got a function of precisely two arguments, whereas we really want our function to use varargs like `print`. This is the role of `_0`:
-
- > _DEBUG = true
- > p = bind(print,'start>', _0)
- return function (fn,_v1)
- return function(...) return fn(_v1,...) end
- end
-
- > p(1,2,3,4,5)
- start> 1 2 3 4 5
-
-I've turned on the global `_DEBUG` flag, so that the function generated is printed out. It is actually a function which _generates_ the required function; the first call _binds the value_ of `_v1` to 'start>'.
-
-### Placeholder Expressions
-
-A common pattern in Penlight is a function which applies another function to all elements in a table or a sequence, such as `tablex.map` or `seq.filter`. Lua does anonymous functions well, although they can be a bit tedious to type:
-
- > = tablex.map(function(x) return x*x end, {1,2,3,4})
- {1,4,9,16}
-
-`pl.func` allows you to define _placeholder expressions_, which can cut down on the typing required, and also make your intent clearer. First, we bring contents of `pl.func` into our context, and then supply an expression using placeholder variables, such as `_1`,`_2`,etc. (C++ programmers will recognize this from the Boost libraries.)
-
- > utils.import 'pl.func'
- > = tablex.map(_1*_1, {1,2,3,4})
- {1,4,9,16}
-
-Functions of up to 5 arguments can be generated.
-
- > = tablex.map2(_1+_2,{1,2,3}, {10,20,30})
- {11,22,33}
-
-These expressions can use arbitrary functions, although they must first be registered with the functional library. `pl.func.register` brings in a single function, and `pl.func.import` brings in a whole table of functions, such as `math`.
-
- > sin = register(math.sin)
- > = tablex.map(sin(_1), {1,2,3,4})
- {0.8414709848079,0.90929742682568,0.14112000805987,-0.75680249530793}
- > import 'math'
- > = tablex.map(cos(2*_1),{1,2,3,4})
- {-0.41614683654714,-0.65364362086361,0.96017028665037,-0.14550003380861}
-
-A common operation is calling a method of a set of objects:
-
- > = tablex.map(_1:sub(1,1), {'one','four','x'})
- {'o','f','x'}
-
-There are some restrictions on what operators can be used in PEs. For instance, because the `__len` metamethod cannot be overridden by plain Lua tables, we need to define a special function to express `#_1`:
-
- > = tablex.map(Len(_1), {'one','four','x'})
- {3,4,1}
-
-Likewise for comparison operators, which cannot be overloaded for _different_ types, and thus also have to be expressed as a special function:
-
- > = tablex.filter(Gt(_1,0), {1,-1,2,4,-3})
- {1,2,4}
-
-It is useful to express the fact that a function returns multiple values. For instance, `tablex.pairmap` expects a function that will be called with the key and the value, and returns the new value and the key, in that order.
-
- > = pairmap(Args(_2,_1:upper()),{fred=1,alice=2})
- {ALICE=2,FRED=1}
-
-PEs cannot contain `nil` values, since PE function arguments are represented as an array. Instead, a special value called `Nil` is provided. So say `_1:f(Nil,1)` instead of `_1:f(nil,1)`.
-
-A placeholder expression cannot be automatically used as a Lua function. The technical reason is that the call operator must be overloaded to construct function calls like `_1(1)`. If you want to force a PE to return a function, use `pl.func.I`.
-
- > = tablex.map(_1(10),{I(2*_1),I(_1*_1),I(_1+2)})
- {20,100,12}
-
-Here we make a table of functions taking a single argument, and then call them all with a value of 10.
-
-The essential idea with PEs is to 'quote' an expression so that it is not immediately evaluated, but instead turned into a function that can be applied later to some arguments. The basic mechanism is to wrap values and placeholders so that the usual Lua operators have the effect of building up an _expression tree_. (It turns out that you can do _symbolic algebra_ using PEs, see symbols.lua in the examples directory, and its test runner testsym.lua, which demonstrates symbolic differentiation.)
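-
-The quoting mechanism can be illustrated with a small stand-alone sketch. This is not the code in `pl.func`: the names `Expr`, `node`, `P1`, `P2`, `eval` and `instantiate` are hypothetical, only `+` and `*` are handled, and the tree is interpreted directly rather than compiled into a function. It only shows how operator metamethods can record an expression instead of evaluating it.
-
-```lua
--- A toy expression-tree builder (illustrative only, not pl.func itself).
-local Expr = {}
-Expr.__index = Expr
-
--- Build a tree node: the operator name plus its operands as array items.
-local function node(op, ...)
-  return setmetatable({op = op, ...}, Expr)
-end
-
-local function is_expr(v) return getmetatable(v) == Expr end
-
--- The arithmetic metamethods do not compute; they record the operation.
-Expr.__add = function(a, b) return node('+', a, b) end
-Expr.__mul = function(a, b) return node('*', a, b) end
-
--- Hypothetical placeholders for the first and second argument.
-local P1, P2 = node('arg', 1), node('arg', 2)
-
--- Walk the tree, substituting actual arguments for placeholders.
-local function eval(e, args)
-  if not is_expr(e) then return e end        -- a plain constant
-  if e.op == 'arg' then return args[e[1]] end
-  local a, b = eval(e[1], args), eval(e[2], args)
-  if e.op == '+' then return a + b end
-  return a * b
-end
-
--- Turn a quoted expression into an ordinary Lua function.
-local function instantiate(e)
-  return function(...) return eval(e, {...}) end
-end
-
-local f = instantiate(P1 * P1 + 2 * P2)
-print(f(3, 4))   -- 3*3 + 2*4 = 17
-```
-
-Note that `2 * P2` is quoted even though the left operand is a plain number, because Lua consults the metatable of either operand; this is the rule that any operator with a PE operand gives a quoted result.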
-
-The rule is that if any operator has a PE operand, the result will be quoted. Sometimes we need to quote things explicitly. For instance, say we want to pass a function to a filter that must return true if the element value is in a set. `set[_1]` is the obvious expression, but it does not give the desired result, since it evaluates directly, giving `nil`. Indexing works differently than a binary operation like addition (`set+_1` _is_ properly quoted) so there is a need for an explicit quoting or wrapping operation. This is the job of the `_` function; the PE in this case should be `_(set)[_1]`. This works for functions as well, as a convenient alternative to registering functions: `_(math.sin)(_1)`. This is equivalent to using the `lines` method:
-
- for line in I(_(f):read()) do print(line) end
-
-Now this will work for _any_ 'file-like' object which has a `read` method returning the next line. If you had a LuaSocket client which was being 'pushed' by lines sent from a server, then `_(s):receive '*l'` would create an iterator for accepting input. These forms can be convenient for adapting your data flow so that it can be passed to the sequence functions in `pl.seq`.
-
-Placeholder expressions can be mixed with sequence wrapper expressions. `lexer.lua` will give us a double-valued sequence of tokens, where the first value is a type, and the second is a value. We filter out only the values where the type is 'iden', extract the actual value using `map`, get the unique values and finally copy to a list.
-
- > str = 'for i=1,10 do for j = 1,10 do print(i,j) end end'
- > = seq(lexer.lua(str)):filter('==','iden'):map(_2):unique():copy()
- {i,print,j}
-
-This is a particularly intense line (and I don't always suggest making everything a one-liner!); the key is the behaviour of `map`, which will take both values of the sequence, so `_2` returns the value part. (Since `filter` here takes extra arguments, it only operates on the type values.)
-
-There are some performance considerations to using placeholder expressions. Instantiating a PE requires constructing and compiling a function, which is not such a fast operation. So to get the best performance, factor PEs out of loops like this:
-
- local fn = I(_1:f() + _2:g())
- for i = 1,n do
- res[i] = tablex.map2(fn,first[i],second[i])
- end
-
-
-## Additional Libraries
-
-Libraries in this section are no longer considered to be part of the Penlight core, but still provide specialized functionality when needed.
-
-<a id="sip"/>
-
-### Simple Input Patterns
-
-Lua string pattern matching is very powerful, and usually you will not need a traditional regular expression library. Even so, sometimes Lua code ends up looking like Perl, which happens because string patterns are not always the easiest things to read, especially for the casual reader. Here is a program which needs to understand three distinct date formats:
-
- -- parsing dates using Lua string patterns
- months={Jan=1,Feb=2,Mar=3,Apr=4,May=5,Jun=6,
- Jul=7,Aug=8,Sep=9,Oct=10,Nov=11,Dec=12}
-
- function check_and_process(d,m,y)
- d = tonumber(d)
- m = tonumber(m)
- y = tonumber(y)
- ....
- end
-
- for line in f:lines() do
- -- ordinary (English) date format
- local d,m,y = line:match('(%d+)/(%d+)/(%d+)')
- if d then
- check_and_process(d,m,y)
- else -- ISO date??
- y,m,d = line:match('(%d+)%-(%d+)%-(%d+)')
- if y then
- check_and_process(d,m,y)
- else -- <day> <month-name> <year>?
- d,mm,y = line:match('(%d+)%s+(%a+)%s+(%d+)')
- m = months[mm]
- check_and_process(d,m,y)
- end
- end
- end
-
-These aren't particularly difficult patterns, but already typical issues are appearing, such as having to escape '-'. Also, `string.match` returns its captures, so that we're forced to use a slightly awkward nested if-statement.
-
-Verification issues will further cloud the picture, since regular expression people try to enforce constraints (like year cannot be more than four digits) using regular expressions, on the usual grounds that one shouldn't stop using a hammer when one is enjoying oneself.
-
-`pl.sip` provides a simple, intuitive way to detect patterns in strings and extract relevant parts.
-
- > sip = require 'pl.sip'
- > dump = require('pl.pretty').dump
- > res = {}
- > c = sip.compile 'ref=$S{file}:$d{line}'
- > = c('ref=hello.c:10',res)
- true
- > dump(res)
- {
- line = 10,
- file = "hello.c"
- }
- > = c('ref=long name, no line',res)
- false
-
-`sip.compile` creates a pattern matcher function, which is given a string and a table. If it matches the string, then `true` is returned and the table is populated according to the _named fields_ in the pattern.
-
-Here is another version of the date parser:
-
- -- using SIP patterns
- function check(t)
- check_and_process(t.day,t.month,t.year)
- end
-
- shortdate = sip.compile('$d{day}/$d{month}/$d{year}')
- longdate = sip.compile('$d{day} $v{mon} $d{year}')
- isodate = sip.compile('$d{year}-$d{month}-$d{day}')
-
- for line in f:lines() do
- local res = {}
- if shortdate(line,res) then
- check(res)
- elseif isodate(line,res) then
- check(res)
- elseif longdate(line,res) then
- res.month = months[res.mon]
- check(res)
- end
- end
-
-SIP patterns start with '$', then a one-letter type, and then an optional variable in curly braces.
-
- Type Meaning
- v variable, or identifier.
- i possibly signed integer
- f floating-point number
- r 'rest of line'
- q quoted string (either ' or ")
- p a path name
- ( anything inside (...)
- [ anything inside [...]
- { anything inside {...}
- < anything inside <...>
- [---------------------------------]
- S non-space
- d digits
- ...
-
-If a type is not one of v,i,f,r or q, then it's assumed to be one of the standard Lua character classes. Any spaces you leave in your pattern will match any number of spaces. And any 'magic' string characters will be escaped.
-
-SIP captures (like `$v{mon}`) do not have to be named. You can use just `$v`, but you have to be consistent; if a pattern contains unnamed captures, then all captures must be unnamed. In this case, the result table is a simple list of values.
-
-`sip.match` is a useful shortcut if you like your matches to be 'in place'. (It caches the result, so it is not much slower than explicitly using `sip.compile`.)
-
- > sip.match('($q{first},$q{second})','("john","smith")',res)
- true
- > res
- {second='smith',first='john'}
- > res = {}
- > sip.match('($q,$q)','("jan","smit")',res) -- unnamed captures
- true
- > res
- {'jan','smit'}
- > sip.match('($q,$q)','("jan", "smit")',res)
- false ---> oops! Can't handle extra space!
- > sip.match('( $q , $q )','("jan", "smit")',res)
- true
-
-As a general rule, allow for whitespace in your patterns.
-
-Finally, putting a ' $' at the end of a pattern means 'capture the rest of the line, starting at the first non-space'.
-
- > sip.match('( $q , $q ) $','("jan", "smit") and a string',res)
- true
- > res
- {'jan','smit','and a string'}
- > res = {}
- > sip.match('( $q{first} , $q{last} ) $','("jan", "smit") and a string',res)
- true
- > res
- {first='jan',rest='and a string',last='smit'}
-
-(@see sip)
-
-<a id="lapp"/>
-
-### Command-line Programs with Lapp
-
-`pl.lapp` is a small and focused Lua module which aims to make standard command-line parsing easier and more intuitive. It implements the standard GNU style, i.e. short flags with one letter start with '-', and there may be an additional long flag which starts with '--'. Generally options which take an argument expect to find it as the next parameter (e.g. 'gcc test.c -o test') but single short options taking a numerical parameter can dispense with the space (e.g. 'head -n4 test.c').
-
-As far as possible, Lapp will convert parameters into their equivalent Lua types, i.e. convert numbers and convert filenames into file objects. If any conversion fails, or a required parameter is missing, an error will be issued and the usage text will be written out. So there are two necessary tasks: supplying the flag and option names, and associating them with a type.
-
-For any non-trivial script, even for personal consumption, it's necessary to supply usage text. The novelty of Lapp is that it starts from that point and defines a loose format for usage strings which can specify the names and types of the parameters.
-
-An example will make this clearer:
-
- -- scale.lua
- lapp = require 'pl.lapp'
- local args = lapp [[
- Does some calculations
- -o,--offset (default 0.0) Offset to add to scaled number
- -s,--scale (number) Scaling factor
- <number> (number ) Number to be scaled
- ]]
-
- print(args.offset + args.scale * args.number)
-
-Here is a command-line session using this script:
-
- $ lua scale.lua
- scale.lua:missing required parameter: scale
-
- Does some calculations
- -o,--offset (default 0.0) Offset to add to scaled number
- -s,--scale (number) Scaling factor
- <number> (number ) Number to be scaled
-
- $ lua scale.lua -s 2.2 10
- 22
-
- $ lua scale.lua -s 2.2 x10
- scale.lua:unable to convert to number: x10
-
- ....(usage as before)
-
-There are two kinds of lines in Lapp usage strings which are meaningful; option and parameter lines. An option line gives the short option, optionally followed by the corresponding long option. A type specifier in parentheses may follow. Similarly, a parameter line starts with '<' PARAMETER '>', followed by a type specifier. Type specifiers are either of the form '(default ' VALUE ')' or '(' TYPE ')'; the default specifier means that the parameter or option has a default value and is not required. TYPE is one of 'string','number','file-in' or 'file-out'; VALUE is a number, one of ('stdin','stdout','stderr') or a token. The rest of the line is not parsed and can be used for explanatory text.
-
-This script shows the relation between the specified parameter names and the fields in the output table.
-
- -- simple.lua
- local args = require ('pl.lapp') [[
- Various flags and option types
- -p A simple optional flag, defaults to false
- -q,--quiet A simple flag with long name
- -o (string) A required option with argument
- <input> (default stdin) Optional input file parameter
- ]]
-
- for k,v in pairs(args) do
- print(k,v)
- end
-
-I've just dumped out all values of the args table; note that args.quiet has become true, because it's specified; args.p defaults to false. If there is a long name for an option, that will be used in preference as a field name. A type or default specifier is not necessary for simple flags, since the default type is boolean.
-
- $ simple -o test -q simple.lua
- p false
- input file (781C1BD8)
- quiet true
- o test
- input_name simple.lua
- D:\dev\lua\lapp>simple -o test simple.lua one two three
- 1 one
- 2 two
- 3 three
- p false
- quiet false
- input file (781C1BD8)
- o test
- input_name simple.lua
-
-The parameter input has been set to an open read-only file object - we know it must be a read-only file since that is the type of the default value. The field input_name is automatically generated, since it's often useful to have access to the original filename.
-
-Notice that any extra parameters supplied will be put in the result table with integer indices, i.e. args[i] where i goes from 1 to #args.
-
-Files don't really have to be closed explicitly for short scripts with a quick well-defined mission, since the result of garbage-collecting file objects is to close them.
-
-#### Enforcing a Range for a Parameter
-
-The type specifier can also be of the form '(' MIN '..' MAX ')'.
-
- local lapp = require 'pl.lapp'
- local args = lapp [[
- Setting ranges
- <x> (1..10) A number from 1 to 10
- <y> (-5..1e6) Bigger range
- ]]
-
- print(args.x,args.y)
-
-Here the meaning is that the value is greater than or equal to MIN and less than or equal to MAX; there is no provision for forcing a parameter to be a whole number.
-
-You may also define custom types that can be used in the type specifier:
-
- lapp = require ('pl.lapp')
-
- lapp.add_type('integer','number',
- function(x)
- lapp.assert(math.ceil(x) == x, 'not an integer!')
- end
- )
-
- local args = lapp [[
- <ival> (integer) Process PID
- ]]
-
- print(args.ival)
-
-`lapp.add_type` takes three parameters, a type name, a converter and a constraint function. The constraint function is expected to throw an assertion if some condition is not true; we use lapp.assert because it fails in the standard way for a command-line script. The converter argument can either be a type name known to Lapp, or a function which takes a string and generates a value.
-
-#### 'varargs' Parameter Arrays
-
- lapp = require 'pl.lapp'
- local args = lapp [[
- Summing numbers
- <numbers...> (number) A list of numbers to be summed
- ]]
-
- local sum = 0
- for i,x in ipairs(args.numbers) do
- sum = sum + x
- end
- print ('sum is '..sum)
-
-The parameter `numbers` has a trailing '...', which indicates that this parameter is a 'varargs' parameter. It must be the last parameter, and `args.numbers` will be an array.
-
-Consider this implementation of the head utility from Mac OS X:
-
- -- implements a BSD-style head
- -- (see http://www.manpagez.com/man/1/head/osx-10.3.php)
-
- lapp = require ('pl.lapp')
-
- local args = lapp [[
- Print the first few lines of specified files
- -n (default 10) Number of lines to print
- <files...> (default stdin) Files to print
- ]]
-
- -- by default, lapp converts file arguments to an actual Lua file object.
- -- But the actual filename is always available as <file>_name.
- -- In this case, 'files' is a varargs array, so that 'files_name' is
- -- also an array.
- local nline = args.n
- local nfile = #args.files
- for i = 1,nfile do
- local file = args.files[i]
- if nfile > 1 then
- print('==> '..args.files_name[i]..' <==')
- end
- local n = 0
- for line in file:lines() do
- print(line)
- n = n + 1
- if n == nline then break end
- end
- end
-
-Note how we have access to all the filenames, because the auto-generated field `files_name` is also an array!
-
-(This is probably not a very considerate script, since Lapp will open all the files provided, and only close them at the end of the script. See the xhead.lua example for another implementation.)
-
-Flags and options may also be declared as vararg arrays, and can occur anywhere. Bear in mind that short options can be combined (like 'tar -xzf'), so it's perfectly legal to have '-vvv'. But normally the value of args.v is just a simple `true` value.
-
- local args = require ('pl.lapp') [[
- -v... Verbosity level; can be -v, -vv or -vvv
- ]]
- vlevel = not args.v[1] and 0 or #args.v
- print(vlevel)
-
-The vlevel assignment is a bit of Lua voodoo, so consider the cases:
-
- * No -v flag, v is just { false }
- * One -v flag, v is { true }
- * Two -v flags, v is { true, true }
- * Three -v flags, v is { true, true, true }
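-
-The four cases can be checked in plain Lua without Lapp. The sketch below assumes only the table shapes listed above; the helper name `vlevel_of` is hypothetical. The point of the `not v[1] and 0 or ...` guard is that `#{false}` is 1, so the length alone would report one `-v` when none was given.
-
-```lua
--- Hypothetical helper wrapping the idiom from the script above.
-local function vlevel_of(v)
-  -- (not v[1]) is true only for { false }; 0 is truthy in Lua,
-  -- so 'and 0 or #v' yields 0 in that case and #v otherwise.
-  return not v[1] and 0 or #v
-end
-
-print(vlevel_of{false})              -- 0 (no -v flag)
-print(vlevel_of{true})               -- 1
-print(vlevel_of{true, true})         -- 2
-print(vlevel_of{true, true, true})   -- 3
-```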
-
-#### Defining a Parameter Callback
-
-If a script implements `lapp.callback`, then Lapp will call it after each argument is parsed. The callback is passed the parameter name, the raw unparsed value, and the result table. It is called immediately after assignment of the value, so the corresponding field is available.
-
- lapp = require ('pl.lapp')
-
- function lapp.callback(parm,arg,args)
- print('+',parm,arg)
- end
-
- local args = lapp [[
- Testing parameter handling
- -p Plain flag (defaults to false)
- -q,--quiet Plain flag with GNU-style optional long name
- -o (string) Required string option
- -n (number) Required number option
- -s (default 1.0) Option that takes a number, but will default
- <start> (number) Required number argument
- <input> (default stdin) A parameter which is an input file
- <output> (default stdout) One that is an output file
- ]]
- print 'args'
- for k,v in pairs(args) do
- print(k,v)
- end
-
-This produces the following output:
-
- $ args -o name -n 2 10 args.lua
- + o name
- + n 2
- + start 10
- + input args.lua
- args
- p false
- s 1
- input_name args.lua
- quiet false
- output file (781C1B98)
- start 10
- input file (781C1BD8)
- o name
- n 2
-
-Callbacks are needed when you want to take action immediately on parsing an argument.
-
-
-## Technical Choices
-
-### Modularity and Granularity
-
-In an ideal world, a program should only load the libraries it needs. Penlight is intended to work in situations where an extra 100Kb of bytecode could be a problem. It is straightforward but tedious to load exactly what you need:
-
- local data = require 'pl.data'
- local List = require 'pl.List'
- local array2d = require 'pl.array2d'
- local seq = require 'pl.seq'
- local utils = require 'pl.utils'
-
-This is the style that I follow in Penlight itself, so that modules don't mess with the global environment; also, `stringx.import()` is not used because it will update the global `string` table.
-
-But `require 'pl'` is more convenient in scripts; the question is how to ensure that one doesn't load the whole kitchen sink as the price of convenience. The strategy is to only load modules when they are referenced. In 'init.lua' (which is loaded by `require 'pl'`) a metatable is attached to the global table with an `__index` metamethod. Any unknown name is looked up in the list of modules, and if found, we require it and make that module globally available. So when `tablex.deepcompare` is encountered, looking up `tablex` causes 'pl.tablex' to be required.
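-
-A minimal sketch of the load-on-demand idea follows. It is not the code in 'init.lua': to stay runnable without Penlight it attaches the metatable to an ordinary table (here called `lazy`, a hypothetical name) rather than to the global table, and the `known` list holds standard library names instead of Penlight modules.
-
-```lua
--- Names that may be loaded on demand. Standard libraries are used here
--- so that the sketch runs on a bare Lua installation.
-local known = { string = true, math = true, table = true }
-
-local lazy = setmetatable({}, {
-  __index = function(t, name)
-    if known[name] then
-      local mod = require(name)  -- loaded only when first referenced
-      rawset(t, name, mod)       -- cached, so __index will not fire again
-      return mod
-    end
-    return nil                   -- unknown names stay undefined
-  end
-})
-
-print(rawget(lazy, 'math'))          -- nil: nothing has been loaded yet
-print(lazy.math.floor(2.5))          -- 2
-print(rawget(lazy, 'math') == math)  -- true: now cached in the table
-```
-
-Caching with `rawset` means the metamethod is only paid for on the first reference to each module name.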
-
-Modifying the behaviour of the global table has consequences. For instance, there is the famous module `strict` which comes with Lua itself (perhaps the only standard Lua module written in Lua itself) which also does this modification so that global variables must be defined before use. So the implementation in 'init.lua' allows for a 'not found' hook, which 'pl.strict.lua' uses.
-
-But the strategy is worth the effort: the old 'kitchen sink' 'init.lua' would pull in about 260K of bytecode, whereas now typical programs use about 100K less, and short scripts do even better - for instance, if they only need functionality in `utils`.
-
-There are some functions which mark their output table with a special metatable, when it seems particularly appropriate. For instance, `tablex.makeset` creates a `Set`, and `seq.copy` creates a `List`. But this does not automatically result in the loading of `pl.Set` and `pl.List`; that happens only if you try to access any of their methods. In 'utils.lua', there is an exported table called `stdmt`:
-
- stdmt = { List = {}, Map = {}, Set = {}, MultiMap = {} }
-
-If you go through 'init.lua', then these plain little 'identity' tables get an `__index` metamethod which forces the loading of the full functionality. Here is the code from 'list.lua' which starts the ball rolling for lists:
-
- List = utils.stdmt.List
- List.__index = List
- List._name = "List"
- List._class = List
-
-The 'load-on-demand' strategy helps to modularize the library. Especially for more casual use, `require 'pl'` is a good compromise between convenience and modularity.
-
-In this current version, I have generally reduced the amount of trickery involved. Previously, `Map` was defined in `pl.class`; now it is sensibly defined in `pl.Map`; `pl.class` only contains the basic class mechanism (and returns that function). For consistency, `List` is returned directly by `require 'pl.List'` (note the uppercase 'L'). Also, the number of module dependencies in the non-core libraries like `pl.config` has been reduced.
-
-### Defining what is Callable
-
-'utils.lua' exports `function_arg` which is used extensively throughout Penlight. It defines what is meant by 'callable'. Obviously true functions are immediately passed back. But what about strings? The first option is that it represents an operator in 'operator.lua', so that '<' is just an alias for `operator.lt`.
-
-We then check whether there is a _function factory_ defined for the metatable of the value.
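-
-The two checks described so far can be sketched as follows. This is an illustration under stated assumptions, not the body of `utils.function_arg`: the `operators` table stands in for 'operator.lua', and `factories` and `to_callable` are hypothetical names.
-
-```lua
--- Hypothetical stand-in for the operator table in 'operator.lua'.
-local operators = {
-  ['+'] = function(a, b) return a + b end,
-  ['<'] = function(a, b) return a < b end,
-}
-
--- Hypothetical registry mapping a metatable to a function factory.
-local factories = {}
-
-local function to_callable(f)
-  if type(f) == 'function' then return f end   -- true functions pass through
-  if type(f) == 'string' then
-    local op = operators[f]                    -- string names an operator?
-    if op then return op end
-    error('unknown operator: ' .. f, 2)
-  end
-  local factory = factories[getmetatable(f)]   -- factory for this kind of value?
-  if factory then return factory(f) end
-  error('not callable: ' .. type(f), 2)
-end
-
-print(to_callable('<')(1, 2))                  -- true
-
--- Register a factory for a made-up 'Const' type and use it.
-local Const = {}
-factories[Const] = function(obj)
-  return function() return obj.value end
-end
-print(to_callable(setmetatable({value = 42}, Const))())   -- 42
-```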
-
-(It is true that strings can be made callable, but in practice this turns out to be a cute but dubious idea, since _all_ strings share the same metatable. A common programming error is to pass the wrong kind of object to a function, and it's better to get a nice clean 'attempting to call a string' message rather than some obscure trace from the bowels of your library.)
-
-The other module that registers a function factory is `pl.func`. Placeholder expressions cannot be directly callable, and so need to be instantiated and cached in as efficient a way as possible.
-
-(An inconsistency is that `utils.is_callable` does not do this thorough check.)
-
-
diff --git a/docs/readme.md b/docs/readme.md
deleted file mode 100644
index ddfb6ae..0000000
--- a/docs/readme.md
+++ /dev/null
@@ -1,9 +0,0 @@
-The docgen.lua script generates the HTML documentation from penlight.md,
-using a custom version of markdown.lua which does ToC generation.
-
-This script also finds @see references and resolves them.
-
-The API documentation requires LuaDoc - run from api folder.
-
-
-