mortar-pygments.rb 0.5.5

Files changed (472)
  1. data/.gitignore +6 -0
  2. data/CHANGELOG.md +71 -0
  3. data/Gemfile +2 -0
  4. data/LICENSE +17 -0
  5. data/README.md +121 -0
  6. data/Rakefile +66 -0
  7. data/bench.rb +22 -0
  8. data/cache-lexers.rb +8 -0
  9. data/lexers +0 -0
  10. data/lib/pygments/lexer.rb +148 -0
  11. data/lib/pygments/mentos.py +351 -0
  12. data/lib/pygments/popen.rb +406 -0
  13. data/lib/pygments/version.rb +3 -0
  14. data/lib/pygments.rb +8 -0
  15. data/pygments.rb.gemspec +25 -0
  16. data/test/test_data.c +2581 -0
  17. data/test/test_data.py +514 -0
  18. data/test/test_data_generated +2582 -0
  19. data/test/test_pygments.rb +287 -0
  20. data/vendor/custom_lexers/github.py +565 -0
  21. data/vendor/pygments-main/AUTHORS +153 -0
  22. data/vendor/pygments-main/CHANGES +889 -0
  23. data/vendor/pygments-main/LICENSE +25 -0
  24. data/vendor/pygments-main/MANIFEST.in +6 -0
  25. data/vendor/pygments-main/Makefile +59 -0
  26. data/vendor/pygments-main/REVISION +1 -0
  27. data/vendor/pygments-main/TODO +15 -0
  28. data/vendor/pygments-main/docs/generate.py +472 -0
  29. data/vendor/pygments-main/docs/pygmentize.1 +94 -0
  30. data/vendor/pygments-main/docs/src/api.txt +270 -0
  31. data/vendor/pygments-main/docs/src/authors.txt +5 -0
  32. data/vendor/pygments-main/docs/src/changelog.txt +5 -0
  33. data/vendor/pygments-main/docs/src/cmdline.txt +147 -0
  34. data/vendor/pygments-main/docs/src/filterdevelopment.txt +70 -0
  35. data/vendor/pygments-main/docs/src/filters.txt +42 -0
  36. data/vendor/pygments-main/docs/src/formatterdevelopment.txt +169 -0
  37. data/vendor/pygments-main/docs/src/formatters.txt +48 -0
  38. data/vendor/pygments-main/docs/src/index.txt +69 -0
  39. data/vendor/pygments-main/docs/src/installation.txt +71 -0
  40. data/vendor/pygments-main/docs/src/integrate.txt +48 -0
  41. data/vendor/pygments-main/docs/src/java.txt +70 -0
  42. data/vendor/pygments-main/docs/src/lexerdevelopment.txt +603 -0
  43. data/vendor/pygments-main/docs/src/lexers.txt +67 -0
  44. data/vendor/pygments-main/docs/src/moinmoin.txt +39 -0
  45. data/vendor/pygments-main/docs/src/plugins.txt +93 -0
  46. data/vendor/pygments-main/docs/src/quickstart.txt +202 -0
  47. data/vendor/pygments-main/docs/src/rstdirective.txt +22 -0
  48. data/vendor/pygments-main/docs/src/styles.txt +143 -0
  49. data/vendor/pygments-main/docs/src/tokens.txt +349 -0
  50. data/vendor/pygments-main/docs/src/unicode.txt +49 -0
  51. data/vendor/pygments-main/external/autopygmentize +64 -0
  52. data/vendor/pygments-main/external/lasso-builtins-generator-9.lasso +144 -0
  53. data/vendor/pygments-main/external/markdown-processor.py +67 -0
  54. data/vendor/pygments-main/external/moin-parser.py +112 -0
  55. data/vendor/pygments-main/external/pygments.bashcomp +38 -0
  56. data/vendor/pygments-main/external/rst-directive-old.py +77 -0
  57. data/vendor/pygments-main/external/rst-directive.py +83 -0
  58. data/vendor/pygments-main/ez_setup.py +276 -0
  59. data/vendor/pygments-main/pygmentize +7 -0
  60. data/vendor/pygments-main/pygments/__init__.py +91 -0
  61. data/vendor/pygments-main/pygments/cmdline.py +441 -0
  62. data/vendor/pygments-main/pygments/console.py +74 -0
  63. data/vendor/pygments-main/pygments/filter.py +74 -0
  64. data/vendor/pygments-main/pygments/filters/__init__.py +356 -0
  65. data/vendor/pygments-main/pygments/formatter.py +95 -0
  66. data/vendor/pygments-main/pygments/formatters/__init__.py +68 -0
  67. data/vendor/pygments-main/pygments/formatters/_mapping.py +92 -0
  68. data/vendor/pygments-main/pygments/formatters/bbcode.py +109 -0
  69. data/vendor/pygments-main/pygments/formatters/html.py +821 -0
  70. data/vendor/pygments-main/pygments/formatters/img.py +553 -0
  71. data/vendor/pygments-main/pygments/formatters/latex.py +378 -0
  72. data/vendor/pygments-main/pygments/formatters/other.py +115 -0
  73. data/vendor/pygments-main/pygments/formatters/rtf.py +136 -0
  74. data/vendor/pygments-main/pygments/formatters/svg.py +154 -0
  75. data/vendor/pygments-main/pygments/formatters/terminal.py +112 -0
  76. data/vendor/pygments-main/pygments/formatters/terminal256.py +222 -0
  77. data/vendor/pygments-main/pygments/lexer.py +765 -0
  78. data/vendor/pygments-main/pygments/lexers/__init__.py +240 -0
  79. data/vendor/pygments-main/pygments/lexers/_asybuiltins.py +1645 -0
  80. data/vendor/pygments-main/pygments/lexers/_clbuiltins.py +232 -0
  81. data/vendor/pygments-main/pygments/lexers/_lassobuiltins.py +5172 -0
  82. data/vendor/pygments-main/pygments/lexers/_luabuiltins.py +249 -0
  83. data/vendor/pygments-main/pygments/lexers/_mapping.py +355 -0
  84. data/vendor/pygments-main/pygments/lexers/_openedgebuiltins.py +562 -0
  85. data/vendor/pygments-main/pygments/lexers/_phpbuiltins.py +3787 -0
  86. data/vendor/pygments-main/pygments/lexers/_postgres_builtins.py +233 -0
  87. data/vendor/pygments-main/pygments/lexers/_robotframeworklexer.py +557 -0
  88. data/vendor/pygments-main/pygments/lexers/_scilab_builtins.py +40 -0
  89. data/vendor/pygments-main/pygments/lexers/_sourcemodbuiltins.py +1072 -0
  90. data/vendor/pygments-main/pygments/lexers/_stan_builtins.py +360 -0
  91. data/vendor/pygments-main/pygments/lexers/_vimbuiltins.py +13 -0
  92. data/vendor/pygments-main/pygments/lexers/agile.py +2290 -0
  93. data/vendor/pygments-main/pygments/lexers/asm.py +398 -0
  94. data/vendor/pygments-main/pygments/lexers/compiled.py +3723 -0
  95. data/vendor/pygments-main/pygments/lexers/dalvik.py +104 -0
  96. data/vendor/pygments-main/pygments/lexers/dotnet.py +671 -0
  97. data/vendor/pygments-main/pygments/lexers/foxpro.py +428 -0
  98. data/vendor/pygments-main/pygments/lexers/functional.py +2731 -0
  99. data/vendor/pygments-main/pygments/lexers/github.py +565 -0
  100. data/vendor/pygments-main/pygments/lexers/hdl.py +356 -0
  101. data/vendor/pygments-main/pygments/lexers/jvm.py +1112 -0
  102. data/vendor/pygments-main/pygments/lexers/math.py +1918 -0
  103. data/vendor/pygments-main/pygments/lexers/other.py +3778 -0
  104. data/vendor/pygments-main/pygments/lexers/parsers.py +778 -0
  105. data/vendor/pygments-main/pygments/lexers/shell.py +424 -0
  106. data/vendor/pygments-main/pygments/lexers/special.py +100 -0
  107. data/vendor/pygments-main/pygments/lexers/sql.py +609 -0
  108. data/vendor/pygments-main/pygments/lexers/templates.py +1742 -0
  109. data/vendor/pygments-main/pygments/lexers/text.py +1893 -0
  110. data/vendor/pygments-main/pygments/lexers/web.py +4045 -0
  111. data/vendor/pygments-main/pygments/modeline.py +40 -0
  112. data/vendor/pygments-main/pygments/plugin.py +74 -0
  113. data/vendor/pygments-main/pygments/scanner.py +104 -0
  114. data/vendor/pygments-main/pygments/style.py +117 -0
  115. data/vendor/pygments-main/pygments/styles/__init__.py +70 -0
  116. data/vendor/pygments-main/pygments/styles/autumn.py +65 -0
  117. data/vendor/pygments-main/pygments/styles/borland.py +51 -0
  118. data/vendor/pygments-main/pygments/styles/bw.py +49 -0
  119. data/vendor/pygments-main/pygments/styles/colorful.py +81 -0
  120. data/vendor/pygments-main/pygments/styles/default.py +73 -0
  121. data/vendor/pygments-main/pygments/styles/emacs.py +72 -0
  122. data/vendor/pygments-main/pygments/styles/friendly.py +72 -0
  123. data/vendor/pygments-main/pygments/styles/fruity.py +42 -0
  124. data/vendor/pygments-main/pygments/styles/manni.py +75 -0
  125. data/vendor/pygments-main/pygments/styles/monokai.py +106 -0
  126. data/vendor/pygments-main/pygments/styles/murphy.py +80 -0
  127. data/vendor/pygments-main/pygments/styles/native.py +65 -0
  128. data/vendor/pygments-main/pygments/styles/pastie.py +75 -0
  129. data/vendor/pygments-main/pygments/styles/perldoc.py +69 -0
  130. data/vendor/pygments-main/pygments/styles/rrt.py +33 -0
  131. data/vendor/pygments-main/pygments/styles/tango.py +141 -0
  132. data/vendor/pygments-main/pygments/styles/trac.py +63 -0
  133. data/vendor/pygments-main/pygments/styles/vim.py +63 -0
  134. data/vendor/pygments-main/pygments/styles/vs.py +38 -0
  135. data/vendor/pygments-main/pygments/token.py +195 -0
  136. data/vendor/pygments-main/pygments/unistring.py +140 -0
  137. data/vendor/pygments-main/pygments/util.py +277 -0
  138. data/vendor/pygments-main/scripts/check_sources.py +242 -0
  139. data/vendor/pygments-main/scripts/detect_missing_analyse_text.py +32 -0
  140. data/vendor/pygments-main/scripts/epydoc.css +280 -0
  141. data/vendor/pygments-main/scripts/find_codetags.py +205 -0
  142. data/vendor/pygments-main/scripts/find_error.py +170 -0
  143. data/vendor/pygments-main/scripts/get_vimkw.py +43 -0
  144. data/vendor/pygments-main/scripts/pylintrc +301 -0
  145. data/vendor/pygments-main/scripts/reindent.py +291 -0
  146. data/vendor/pygments-main/scripts/vim2pygments.py +933 -0
  147. data/vendor/pygments-main/setup.cfg +7 -0
  148. data/vendor/pygments-main/setup.py +90 -0
  149. data/vendor/pygments-main/tests/dtds/HTML4-f.dtd +37 -0
  150. data/vendor/pygments-main/tests/dtds/HTML4-s.dtd +869 -0
  151. data/vendor/pygments-main/tests/dtds/HTML4.dcl +88 -0
  152. data/vendor/pygments-main/tests/dtds/HTML4.dtd +1092 -0
  153. data/vendor/pygments-main/tests/dtds/HTML4.soc +9 -0
  154. data/vendor/pygments-main/tests/dtds/HTMLlat1.ent +195 -0
  155. data/vendor/pygments-main/tests/dtds/HTMLspec.ent +77 -0
  156. data/vendor/pygments-main/tests/dtds/HTMLsym.ent +241 -0
  157. data/vendor/pygments-main/tests/examplefiles/ANTLRv3.g +608 -0
  158. data/vendor/pygments-main/tests/examplefiles/AcidStateAdvanced.hs +209 -0
  159. data/vendor/pygments-main/tests/examplefiles/AlternatingGroup.mu +102 -0
  160. data/vendor/pygments-main/tests/examplefiles/BOM.js +1 -0
  161. data/vendor/pygments-main/tests/examplefiles/CPDictionary.j +611 -0
  162. data/vendor/pygments-main/tests/examplefiles/Config.in.cache +1973 -0
  163. data/vendor/pygments-main/tests/examplefiles/Constants.mo +158 -0
  164. data/vendor/pygments-main/tests/examplefiles/DancingSudoku.lhs +411 -0
  165. data/vendor/pygments-main/tests/examplefiles/Deflate.fs +578 -0
  166. data/vendor/pygments-main/tests/examplefiles/Errors.scala +18 -0
  167. data/vendor/pygments-main/tests/examplefiles/File.hy +174 -0
  168. data/vendor/pygments-main/tests/examplefiles/Get-CommandDefinitionHtml.ps1 +66 -0
  169. data/vendor/pygments-main/tests/examplefiles/IPDispatchC.nc +104 -0
  170. data/vendor/pygments-main/tests/examplefiles/IPDispatchP.nc +671 -0
  171. data/vendor/pygments-main/tests/examplefiles/Intro.java +1660 -0
  172. data/vendor/pygments-main/tests/examplefiles/Makefile +1131 -0
  173. data/vendor/pygments-main/tests/examplefiles/Object.st +4394 -0
  174. data/vendor/pygments-main/tests/examplefiles/OrderedMap.hx +584 -0
  175. data/vendor/pygments-main/tests/examplefiles/RoleQ.pm6 +23 -0
  176. data/vendor/pygments-main/tests/examplefiles/SmallCheck.hs +378 -0
  177. data/vendor/pygments-main/tests/examplefiles/Sorting.mod +470 -0
  178. data/vendor/pygments-main/tests/examplefiles/Sudoku.lhs +382 -0
  179. data/vendor/pygments-main/tests/examplefiles/addressbook.proto +30 -0
  180. data/vendor/pygments-main/tests/examplefiles/antlr_throws +1 -0
  181. data/vendor/pygments-main/tests/examplefiles/apache2.conf +393 -0
  182. data/vendor/pygments-main/tests/examplefiles/as3_test.as +143 -0
  183. data/vendor/pygments-main/tests/examplefiles/as3_test2.as +46 -0
  184. data/vendor/pygments-main/tests/examplefiles/as3_test3.as +3 -0
  185. data/vendor/pygments-main/tests/examplefiles/aspx-cs_example +27 -0
  186. data/vendor/pygments-main/tests/examplefiles/autoit_submit.au3 +25 -0
  187. data/vendor/pygments-main/tests/examplefiles/badcase.java +2 -0
  188. data/vendor/pygments-main/tests/examplefiles/batchfile.bat +49 -0
  189. data/vendor/pygments-main/tests/examplefiles/bigtest.nsi +308 -0
  190. data/vendor/pygments-main/tests/examplefiles/boot-9.scm +1557 -0
  191. data/vendor/pygments-main/tests/examplefiles/ca65_example +284 -0
  192. data/vendor/pygments-main/tests/examplefiles/cbmbas_example +9 -0
  193. data/vendor/pygments-main/tests/examplefiles/cells.ps +515 -0
  194. data/vendor/pygments-main/tests/examplefiles/ceval.c +2604 -0
  195. data/vendor/pygments-main/tests/examplefiles/cheetah_example.html +13 -0
  196. data/vendor/pygments-main/tests/examplefiles/classes.dylan +125 -0
  197. data/vendor/pygments-main/tests/examplefiles/condensed_ruby.rb +10 -0
  198. data/vendor/pygments-main/tests/examplefiles/coq_RelationClasses +447 -0
  199. data/vendor/pygments-main/tests/examplefiles/database.pytb +20 -0
  200. data/vendor/pygments-main/tests/examplefiles/de.MoinMoin.po +2461 -0
  201. data/vendor/pygments-main/tests/examplefiles/demo.ahk +181 -0
  202. data/vendor/pygments-main/tests/examplefiles/demo.cfm +38 -0
  203. data/vendor/pygments-main/tests/examplefiles/django_sample.html+django +68 -0
  204. data/vendor/pygments-main/tests/examplefiles/dwarf.cw +17 -0
  205. data/vendor/pygments-main/tests/examplefiles/erl_session +10 -0
  206. data/vendor/pygments-main/tests/examplefiles/escape_semicolon.clj +1 -0
  207. data/vendor/pygments-main/tests/examplefiles/evil_regex.js +48 -0
  208. data/vendor/pygments-main/tests/examplefiles/example.Rd +78 -0
  209. data/vendor/pygments-main/tests/examplefiles/example.bug +54 -0
  210. data/vendor/pygments-main/tests/examplefiles/example.c +2080 -0
  211. data/vendor/pygments-main/tests/examplefiles/example.ceylon +52 -0
  212. data/vendor/pygments-main/tests/examplefiles/example.clay +33 -0
  213. data/vendor/pygments-main/tests/examplefiles/example.cls +15 -0
  214. data/vendor/pygments-main/tests/examplefiles/example.cob +3556 -0
  215. data/vendor/pygments-main/tests/examplefiles/example.cpp +2363 -0
  216. data/vendor/pygments-main/tests/examplefiles/example.gs +106 -0
  217. data/vendor/pygments-main/tests/examplefiles/example.gst +7 -0
  218. data/vendor/pygments-main/tests/examplefiles/example.hx +142 -0
  219. data/vendor/pygments-main/tests/examplefiles/example.jag +48 -0
  220. data/vendor/pygments-main/tests/examplefiles/example.kt +47 -0
  221. data/vendor/pygments-main/tests/examplefiles/example.lagda +19 -0
  222. data/vendor/pygments-main/tests/examplefiles/example.lua +250 -0
  223. data/vendor/pygments-main/tests/examplefiles/example.monkey +152 -0
  224. data/vendor/pygments-main/tests/examplefiles/example.moo +26 -0
  225. data/vendor/pygments-main/tests/examplefiles/example.moon +629 -0
  226. data/vendor/pygments-main/tests/examplefiles/example.msc +43 -0
  227. data/vendor/pygments-main/tests/examplefiles/example.nim +1010 -0
  228. data/vendor/pygments-main/tests/examplefiles/example.ns2 +69 -0
  229. data/vendor/pygments-main/tests/examplefiles/example.p +34 -0
  230. data/vendor/pygments-main/tests/examplefiles/example.pas +2708 -0
  231. data/vendor/pygments-main/tests/examplefiles/example.prg +161 -0
  232. data/vendor/pygments-main/tests/examplefiles/example.rb +1852 -0
  233. data/vendor/pygments-main/tests/examplefiles/example.reg +19 -0
  234. data/vendor/pygments-main/tests/examplefiles/example.rexx +50 -0
  235. data/vendor/pygments-main/tests/examplefiles/example.rhtml +561 -0
  236. data/vendor/pygments-main/tests/examplefiles/example.rkt +95 -0
  237. data/vendor/pygments-main/tests/examplefiles/example.rpf +4 -0
  238. data/vendor/pygments-main/tests/examplefiles/example.sh-session +19 -0
  239. data/vendor/pygments-main/tests/examplefiles/example.shell-session +45 -0
  240. data/vendor/pygments-main/tests/examplefiles/example.sml +156 -0
  241. data/vendor/pygments-main/tests/examplefiles/example.snobol +15 -0
  242. data/vendor/pygments-main/tests/examplefiles/example.stan +108 -0
  243. data/vendor/pygments-main/tests/examplefiles/example.tea +34 -0
  244. data/vendor/pygments-main/tests/examplefiles/example.ts +28 -0
  245. data/vendor/pygments-main/tests/examplefiles/example.u +548 -0
  246. data/vendor/pygments-main/tests/examplefiles/example.weechatlog +9 -0
  247. data/vendor/pygments-main/tests/examplefiles/example.xhtml +376 -0
  248. data/vendor/pygments-main/tests/examplefiles/example.xtend +34 -0
  249. data/vendor/pygments-main/tests/examplefiles/example.yaml +302 -0
  250. data/vendor/pygments-main/tests/examplefiles/example2.aspx +29 -0
  251. data/vendor/pygments-main/tests/examplefiles/example2.msc +79 -0
  252. data/vendor/pygments-main/tests/examplefiles/example_elixir.ex +363 -0
  253. data/vendor/pygments-main/tests/examplefiles/example_file.fy +128 -0
  254. data/vendor/pygments-main/tests/examplefiles/firefox.mak +586 -0
  255. data/vendor/pygments-main/tests/examplefiles/flipflop.sv +19 -0
  256. data/vendor/pygments-main/tests/examplefiles/foo.sce +6 -0
  257. data/vendor/pygments-main/tests/examplefiles/format.ml +1213 -0
  258. data/vendor/pygments-main/tests/examplefiles/fucked_up.rb +77 -0
  259. data/vendor/pygments-main/tests/examplefiles/function.mu +1 -0
  260. data/vendor/pygments-main/tests/examplefiles/functional.rst +1472 -0
  261. data/vendor/pygments-main/tests/examplefiles/garcia-wachs.kk +133 -0
  262. data/vendor/pygments-main/tests/examplefiles/genclass.clj +510 -0
  263. data/vendor/pygments-main/tests/examplefiles/genshi_example.xml+genshi +193 -0
  264. data/vendor/pygments-main/tests/examplefiles/genshitext_example.genshitext +33 -0
  265. data/vendor/pygments-main/tests/examplefiles/glsl.frag +7 -0
  266. data/vendor/pygments-main/tests/examplefiles/glsl.vert +13 -0
  267. data/vendor/pygments-main/tests/examplefiles/grammar-test.p6 +22 -0
  268. data/vendor/pygments-main/tests/examplefiles/hello.smali +40 -0
  269. data/vendor/pygments-main/tests/examplefiles/hello.sp +9 -0
  270. data/vendor/pygments-main/tests/examplefiles/html+php_faulty.php +1 -0
  271. data/vendor/pygments-main/tests/examplefiles/http_request_example +15 -0
  272. data/vendor/pygments-main/tests/examplefiles/http_response_example +29 -0
  273. data/vendor/pygments-main/tests/examplefiles/import.hs +4 -0
  274. data/vendor/pygments-main/tests/examplefiles/inet_pton6.dg +71 -0
  275. data/vendor/pygments-main/tests/examplefiles/intro.ik +24 -0
  276. data/vendor/pygments-main/tests/examplefiles/ints.php +10 -0
  277. data/vendor/pygments-main/tests/examplefiles/intsyn.fun +675 -0
  278. data/vendor/pygments-main/tests/examplefiles/intsyn.sig +286 -0
  279. data/vendor/pygments-main/tests/examplefiles/irb_heredoc +8 -0
  280. data/vendor/pygments-main/tests/examplefiles/irc.lsp +214 -0
  281. data/vendor/pygments-main/tests/examplefiles/java.properties +16 -0
  282. data/vendor/pygments-main/tests/examplefiles/jbst_example1.jbst +28 -0
  283. data/vendor/pygments-main/tests/examplefiles/jbst_example2.jbst +45 -0
  284. data/vendor/pygments-main/tests/examplefiles/jinjadesignerdoc.rst +713 -0
  285. data/vendor/pygments-main/tests/examplefiles/json.lasso +301 -0
  286. data/vendor/pygments-main/tests/examplefiles/json.lasso9 +213 -0
  287. data/vendor/pygments-main/tests/examplefiles/lighttpd_config.conf +13 -0
  288. data/vendor/pygments-main/tests/examplefiles/linecontinuation.py +47 -0
  289. data/vendor/pygments-main/tests/examplefiles/livescript-demo.ls +41 -0
  290. data/vendor/pygments-main/tests/examplefiles/logos_example.xm +28 -0
  291. data/vendor/pygments-main/tests/examplefiles/ltmain.sh +2849 -0
  292. data/vendor/pygments-main/tests/examplefiles/main.cmake +42 -0
  293. data/vendor/pygments-main/tests/examplefiles/markdown.lsp +679 -0
  294. data/vendor/pygments-main/tests/examplefiles/matlab_noreturn +3 -0
  295. data/vendor/pygments-main/tests/examplefiles/matlab_sample +30 -0
  296. data/vendor/pygments-main/tests/examplefiles/matlabsession_sample.txt +37 -0
  297. data/vendor/pygments-main/tests/examplefiles/metagrammar.treetop +455 -0
  298. data/vendor/pygments-main/tests/examplefiles/mg_sample.pro +73 -0
  299. data/vendor/pygments-main/tests/examplefiles/minehunt.qml +112 -0
  300. data/vendor/pygments-main/tests/examplefiles/minimal.ns2 +4 -0
  301. data/vendor/pygments-main/tests/examplefiles/moin_SyntaxReference.txt +340 -0
  302. data/vendor/pygments-main/tests/examplefiles/multiline_regexes.rb +38 -0
  303. data/vendor/pygments-main/tests/examplefiles/nanomsg.intr +95 -0
  304. data/vendor/pygments-main/tests/examplefiles/nasm_aoutso.asm +96 -0
  305. data/vendor/pygments-main/tests/examplefiles/nasm_objexe.asm +30 -0
  306. data/vendor/pygments-main/tests/examplefiles/nemerle_sample.n +87 -0
  307. data/vendor/pygments-main/tests/examplefiles/nginx_nginx.conf +118 -0
  308. data/vendor/pygments-main/tests/examplefiles/numbers.c +12 -0
  309. data/vendor/pygments-main/tests/examplefiles/objc_example.m +32 -0
  310. data/vendor/pygments-main/tests/examplefiles/objc_example2.m +24 -0
  311. data/vendor/pygments-main/tests/examplefiles/perl_misc +62 -0
  312. data/vendor/pygments-main/tests/examplefiles/perl_perl5db +998 -0
  313. data/vendor/pygments-main/tests/examplefiles/perl_regex-delims +120 -0
  314. data/vendor/pygments-main/tests/examplefiles/perlfunc.1 +856 -0
  315. data/vendor/pygments-main/tests/examplefiles/phpMyAdmin.spec +163 -0
  316. data/vendor/pygments-main/tests/examplefiles/phpcomplete.vim +567 -0
  317. data/vendor/pygments-main/tests/examplefiles/pleac.in.rb +1223 -0
  318. data/vendor/pygments-main/tests/examplefiles/postgresql_test.txt +47 -0
  319. data/vendor/pygments-main/tests/examplefiles/pppoe.applescript +10 -0
  320. data/vendor/pygments-main/tests/examplefiles/psql_session.txt +122 -0
  321. data/vendor/pygments-main/tests/examplefiles/py3_test.txt +2 -0
  322. data/vendor/pygments-main/tests/examplefiles/py3tb_test.py3tb +4 -0
  323. data/vendor/pygments-main/tests/examplefiles/pycon_test.pycon +14 -0
  324. data/vendor/pygments-main/tests/examplefiles/pytb_test2.pytb +2 -0
  325. data/vendor/pygments-main/tests/examplefiles/pytb_test3.pytb +4 -0
  326. data/vendor/pygments-main/tests/examplefiles/python25-bsd.mak +234 -0
  327. data/vendor/pygments-main/tests/examplefiles/qsort.prolog +13 -0
  328. data/vendor/pygments-main/tests/examplefiles/r-console-transcript.Rout +38 -0
  329. data/vendor/pygments-main/tests/examplefiles/ragel-cpp_rlscan +280 -0
  330. data/vendor/pygments-main/tests/examplefiles/ragel-cpp_snippet +2 -0
  331. data/vendor/pygments-main/tests/examplefiles/regex.js +22 -0
  332. data/vendor/pygments-main/tests/examplefiles/reversi.lsp +427 -0
  333. data/vendor/pygments-main/tests/examplefiles/robotframework.txt +39 -0
  334. data/vendor/pygments-main/tests/examplefiles/ruby_func_def.rb +11 -0
  335. data/vendor/pygments-main/tests/examplefiles/rust_example.rs +233 -0
  336. data/vendor/pygments-main/tests/examplefiles/scilab.sci +30 -0
  337. data/vendor/pygments-main/tests/examplefiles/session.dylan-console +9 -0
  338. data/vendor/pygments-main/tests/examplefiles/sibling.prolog +19 -0
  339. data/vendor/pygments-main/tests/examplefiles/simple.md +747 -0
  340. data/vendor/pygments-main/tests/examplefiles/smarty_example.html +209 -0
  341. data/vendor/pygments-main/tests/examplefiles/source.lgt +343 -0
  342. data/vendor/pygments-main/tests/examplefiles/sources.list +62 -0
  343. data/vendor/pygments-main/tests/examplefiles/sphere.pov +18 -0
  344. data/vendor/pygments-main/tests/examplefiles/sqlite3.sqlite3-console +27 -0
  345. data/vendor/pygments-main/tests/examplefiles/squid.conf +30 -0
  346. data/vendor/pygments-main/tests/examplefiles/string.jl +1031 -0
  347. data/vendor/pygments-main/tests/examplefiles/string_delimiters.d +21 -0
  348. data/vendor/pygments-main/tests/examplefiles/stripheredoc.sh +3 -0
  349. data/vendor/pygments-main/tests/examplefiles/swig_java.swg +1329 -0
  350. data/vendor/pygments-main/tests/examplefiles/swig_std_vector.i +225 -0
  351. data/vendor/pygments-main/tests/examplefiles/test.R +153 -0
  352. data/vendor/pygments-main/tests/examplefiles/test.adb +211 -0
  353. data/vendor/pygments-main/tests/examplefiles/test.agda +102 -0
  354. data/vendor/pygments-main/tests/examplefiles/test.asy +131 -0
  355. data/vendor/pygments-main/tests/examplefiles/test.awk +121 -0
  356. data/vendor/pygments-main/tests/examplefiles/test.bas +29 -0
  357. data/vendor/pygments-main/tests/examplefiles/test.bb +95 -0
  358. data/vendor/pygments-main/tests/examplefiles/test.bmx +145 -0
  359. data/vendor/pygments-main/tests/examplefiles/test.boo +39 -0
  360. data/vendor/pygments-main/tests/examplefiles/test.bro +250 -0
  361. data/vendor/pygments-main/tests/examplefiles/test.cs +374 -0
  362. data/vendor/pygments-main/tests/examplefiles/test.css +54 -0
  363. data/vendor/pygments-main/tests/examplefiles/test.cu +36 -0
  364. data/vendor/pygments-main/tests/examplefiles/test.d +135 -0
  365. data/vendor/pygments-main/tests/examplefiles/test.dart +23 -0
  366. data/vendor/pygments-main/tests/examplefiles/test.dtd +89 -0
  367. data/vendor/pygments-main/tests/examplefiles/test.ebnf +31 -0
  368. data/vendor/pygments-main/tests/examplefiles/test.ec +605 -0
  369. data/vendor/pygments-main/tests/examplefiles/test.ecl +58 -0
  370. data/vendor/pygments-main/tests/examplefiles/test.eh +315 -0
  371. data/vendor/pygments-main/tests/examplefiles/test.erl +169 -0
  372. data/vendor/pygments-main/tests/examplefiles/test.evoque +33 -0
  373. data/vendor/pygments-main/tests/examplefiles/test.fan +818 -0
  374. data/vendor/pygments-main/tests/examplefiles/test.flx +57 -0
  375. data/vendor/pygments-main/tests/examplefiles/test.gdc +13 -0
  376. data/vendor/pygments-main/tests/examplefiles/test.groovy +97 -0
  377. data/vendor/pygments-main/tests/examplefiles/test.html +339 -0
  378. data/vendor/pygments-main/tests/examplefiles/test.ini +10 -0
  379. data/vendor/pygments-main/tests/examplefiles/test.java +653 -0
  380. data/vendor/pygments-main/tests/examplefiles/test.jsp +24 -0
  381. data/vendor/pygments-main/tests/examplefiles/test.maql +45 -0
  382. data/vendor/pygments-main/tests/examplefiles/test.mod +374 -0
  383. data/vendor/pygments-main/tests/examplefiles/test.moo +51 -0
  384. data/vendor/pygments-main/tests/examplefiles/test.myt +166 -0
  385. data/vendor/pygments-main/tests/examplefiles/test.nim +93 -0
  386. data/vendor/pygments-main/tests/examplefiles/test.opa +10 -0
  387. data/vendor/pygments-main/tests/examplefiles/test.p6 +252 -0
  388. data/vendor/pygments-main/tests/examplefiles/test.pas +743 -0
  389. data/vendor/pygments-main/tests/examplefiles/test.php +505 -0
  390. data/vendor/pygments-main/tests/examplefiles/test.plot +333 -0
  391. data/vendor/pygments-main/tests/examplefiles/test.ps1 +108 -0
  392. data/vendor/pygments-main/tests/examplefiles/test.pypylog +1839 -0
  393. data/vendor/pygments-main/tests/examplefiles/test.r3 +94 -0
  394. data/vendor/pygments-main/tests/examplefiles/test.rb +177 -0
  395. data/vendor/pygments-main/tests/examplefiles/test.rhtml +43 -0
  396. data/vendor/pygments-main/tests/examplefiles/test.scaml +8 -0
  397. data/vendor/pygments-main/tests/examplefiles/test.ssp +12 -0
  398. data/vendor/pygments-main/tests/examplefiles/test.tcsh +830 -0
  399. data/vendor/pygments-main/tests/examplefiles/test.vb +407 -0
  400. data/vendor/pygments-main/tests/examplefiles/test.vhdl +161 -0
  401. data/vendor/pygments-main/tests/examplefiles/test.xqy +138 -0
  402. data/vendor/pygments-main/tests/examplefiles/test.xsl +23 -0
  403. data/vendor/pygments-main/tests/examplefiles/test2.pypylog +120 -0
  404. data/vendor/pygments-main/tests/examplefiles/truncated.pytb +15 -0
  405. data/vendor/pygments-main/tests/examplefiles/type.lisp +1218 -0
  406. data/vendor/pygments-main/tests/examplefiles/underscore.coffee +603 -0
  407. data/vendor/pygments-main/tests/examplefiles/unicode.applescript +5 -0
  408. data/vendor/pygments-main/tests/examplefiles/unicodedoc.py +11 -0
  409. data/vendor/pygments-main/tests/examplefiles/unix-io.lid +37 -0
  410. data/vendor/pygments-main/tests/examplefiles/webkit-transition.css +3 -0
  411. data/vendor/pygments-main/tests/examplefiles/while.pov +13 -0
  412. data/vendor/pygments-main/tests/examplefiles/wiki.factor +384 -0
  413. data/vendor/pygments-main/tests/examplefiles/xml_example +1897 -0
  414. data/vendor/pygments-main/tests/examplefiles/zmlrpc.f90 +798 -0
  415. data/vendor/pygments-main/tests/old_run.py +138 -0
  416. data/vendor/pygments-main/tests/run.py +49 -0
  417. data/vendor/pygments-main/tests/support/tags +36 -0
  418. data/vendor/pygments-main/tests/support.py +15 -0
  419. data/vendor/pygments-main/tests/test_basic_api.py +295 -0
  420. data/vendor/pygments-main/tests/test_clexer.py +31 -0
  421. data/vendor/pygments-main/tests/test_cmdline.py +105 -0
  422. data/vendor/pygments-main/tests/test_examplefiles.py +99 -0
  423. data/vendor/pygments-main/tests/test_html_formatter.py +178 -0
  424. data/vendor/pygments-main/tests/test_latex_formatter.py +55 -0
  425. data/vendor/pygments-main/tests/test_lexers_other.py +68 -0
  426. data/vendor/pygments-main/tests/test_perllexer.py +137 -0
  427. data/vendor/pygments-main/tests/test_regexlexer.py +47 -0
  428. data/vendor/pygments-main/tests/test_token.py +46 -0
  429. data/vendor/pygments-main/tests/test_using_api.py +40 -0
  430. data/vendor/pygments-main/tests/test_util.py +135 -0
  431. data/vendor/simplejson/.gitignore +10 -0
  432. data/vendor/simplejson/.travis.yml +5 -0
  433. data/vendor/simplejson/CHANGES.txt +291 -0
  434. data/vendor/simplejson/LICENSE.txt +19 -0
  435. data/vendor/simplejson/MANIFEST.in +5 -0
  436. data/vendor/simplejson/README.rst +19 -0
  437. data/vendor/simplejson/conf.py +179 -0
  438. data/vendor/simplejson/index.rst +628 -0
  439. data/vendor/simplejson/scripts/make_docs.py +18 -0
  440. data/vendor/simplejson/setup.py +104 -0
  441. data/vendor/simplejson/simplejson/__init__.py +510 -0
  442. data/vendor/simplejson/simplejson/_speedups.c +2745 -0
  443. data/vendor/simplejson/simplejson/decoder.py +425 -0
  444. data/vendor/simplejson/simplejson/encoder.py +567 -0
  445. data/vendor/simplejson/simplejson/ordered_dict.py +119 -0
  446. data/vendor/simplejson/simplejson/scanner.py +77 -0
  447. data/vendor/simplejson/simplejson/tests/__init__.py +67 -0
  448. data/vendor/simplejson/simplejson/tests/test_bigint_as_string.py +55 -0
  449. data/vendor/simplejson/simplejson/tests/test_check_circular.py +30 -0
  450. data/vendor/simplejson/simplejson/tests/test_decimal.py +66 -0
  451. data/vendor/simplejson/simplejson/tests/test_decode.py +83 -0
  452. data/vendor/simplejson/simplejson/tests/test_default.py +9 -0
  453. data/vendor/simplejson/simplejson/tests/test_dump.py +67 -0
  454. data/vendor/simplejson/simplejson/tests/test_encode_basestring_ascii.py +46 -0
  455. data/vendor/simplejson/simplejson/tests/test_encode_for_html.py +32 -0
  456. data/vendor/simplejson/simplejson/tests/test_errors.py +34 -0
  457. data/vendor/simplejson/simplejson/tests/test_fail.py +91 -0
  458. data/vendor/simplejson/simplejson/tests/test_float.py +19 -0
  459. data/vendor/simplejson/simplejson/tests/test_indent.py +86 -0
  460. data/vendor/simplejson/simplejson/tests/test_item_sort_key.py +20 -0
  461. data/vendor/simplejson/simplejson/tests/test_namedtuple.py +121 -0
  462. data/vendor/simplejson/simplejson/tests/test_pass1.py +76 -0
  463. data/vendor/simplejson/simplejson/tests/test_pass2.py +14 -0
  464. data/vendor/simplejson/simplejson/tests/test_pass3.py +20 -0
  465. data/vendor/simplejson/simplejson/tests/test_recursion.py +67 -0
  466. data/vendor/simplejson/simplejson/tests/test_scanstring.py +117 -0
  467. data/vendor/simplejson/simplejson/tests/test_separators.py +42 -0
  468. data/vendor/simplejson/simplejson/tests/test_speedups.py +20 -0
  469. data/vendor/simplejson/simplejson/tests/test_tuple.py +49 -0
  470. data/vendor/simplejson/simplejson/tests/test_unicode.py +109 -0
  471. data/vendor/simplejson/simplejson/tool.py +39 -0
  472. metadata +566 -0
@@ -0,0 +1,1472 @@
1
+ Functional Programming HOWTO
2
+ ================================
3
+
4
+ **Version 0.30**
5
+
6
+ (This is a first draft. Please send comments/error
7
+ reports/suggestions to amk@amk.ca. This URL is probably not going to
8
+ be the final location of the document, so be careful about linking to
9
+ it -- you may want to add a disclaimer.)
10
+
11
+ In this document, we'll take a tour of Python's features suitable for
12
+ implementing programs in a functional style. After an introduction to
13
+ the concepts of functional programming, we'll look at language
14
+ features such as iterators and generators and relevant library modules
15
+ such as ``itertools`` and ``functools``.
16
+
17
+
18
+ .. contents::
19
+
20
+ Introduction
21
+ ----------------------
22
+
23
+ This section explains the basic concept of functional programming; if
24
+ you're just interested in learning about Python language features,
25
+ skip to the next section.
26
+
27
+ Programming languages support decomposing problems in several different
28
+ ways:
29
+
30
+ * Most programming languages are **procedural**:
31
+ programs are lists of instructions that tell the computer what to
32
+ do with the program's input.
33
+ C, Pascal, and even Unix shells are procedural languages.
34
+
35
+ * In **declarative** languages, you write a specification that describes
36
+ the problem to be solved, and the language implementation figures out
37
+ how to perform the computation efficiently. SQL is the declarative
38
+ language you're most likely to be familiar with; a SQL query describes
39
+ the data set you want to retrieve, and the SQL engine decides whether to
40
+ scan tables or use indexes, which subclauses should be performed first,
41
+ etc.
42
+
43
+ * **Object-oriented** programs manipulate collections of objects.
44
+ Objects have internal state and support methods that query or modify
45
+ this internal state in some way. Smalltalk and Java are
46
+ object-oriented languages. C++ and Python are languages that
47
+ support object-oriented programming, but don't force the use
48
+ of object-oriented features.
49
+
50
+ * **Functional** programming decomposes a problem into a set of functions.
51
+ Ideally, functions only take inputs and produce outputs, and don't have any
52
+ internal state that affects the output produced for a given input.
53
+ Well-known functional languages include the ML family (Standard ML,
54
+ OCaml, and other variants) and Haskell.
55
+
56
+ The designers of some computer languages have chosen to emphasize one
57
+ approach to programming. This often makes it difficult to
58
+ write programs that use a different approach. Other languages are
59
+ multi-paradigm languages that support several different approaches. Lisp,
60
+ C++, and Python are multi-paradigm; you can write programs or
61
+ libraries that are largely procedural, object-oriented, or functional
62
+ in all of these languages. In a large program, different sections
63
+ might be written using different approaches; the GUI might be object-oriented
64
+ while the processing logic is procedural or functional, for example.
65
+
66
+ In a functional program, input flows through a set of functions. Each
67
+ function operates on its input and produces some output. Functional
68
+ style frowns upon functions with side effects that modify internal
69
+ state or make other changes that aren't visible in the function's
70
+ return value. Functions that have no side effects at all are
71
+ called **purely functional**.
72
+ Avoiding side effects means not using data structures
73
+ that get updated as a program runs; every function's output
74
+ must only depend on its input.
75
+
76
+ Some languages are very strict about purity and don't even have
77
+ assignment statements such as ``a=3`` or ``c = a + b``, but it's
78
+ difficult to avoid all side effects. Printing to the screen or
79
+ writing to a disk file are side effects, for example. In
80
+ Python, a ``print`` statement and a ``time.sleep(1)`` call both return no
81
+ useful value; they're only called for their side effects of sending
82
+ some text to the screen or pausing execution for a second.
83
+
84
+ Python programs written in functional style usually won't go to the
85
+ extreme of avoiding all I/O or all assignments; instead, they'll
86
+ provide a functional-appearing interface but will use non-functional
87
+ features internally. For example, the implementation of a function
88
+ will still use assignments to local variables, but won't modify global
89
+ variables or have other side effects.
90
+
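As a small illustration of the difference, here is a sketch contrasting a function that relies on a global variable with one that offers a functional interface while still using local assignments internally. The names ``add_to_total`` and ``total`` are made up for this example.

```python
# Impure: the result depends on (and changes) a module-level variable.
_running_total = 0

def add_to_total(n):
    global _running_total
    _running_total += n
    return _running_total

# Functional interface: local assignments are used internally, but the
# result depends only on the argument and nothing outside is modified.
def total(numbers):
    result = 0
    for n in numbers:
        result += n
    return result

first = add_to_total(5)
second = add_to_total(5)    # same argument, different answer

same_a = total([1, 2, 3])
same_b = total([1, 2, 3])   # same argument, same answer every time
```

Calling ``add_to_total(5)`` twice gives two different results, while ``total([1, 2, 3])`` can be called any number of times with the same outcome.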
91
+ Functional programming can be considered the opposite of
92
+ object-oriented programming. Objects are little capsules containing
93
+ some internal state along with a collection of method calls that let
94
+ you modify this state, and programs consist of making the right set of
95
+ state changes. Functional programming wants to avoid state changes as
96
+ much as possible and works with data flowing between functions. In
97
+ Python you might combine the two approaches by writing functions that
98
+ take and return instances representing objects in your application
99
+ (e-mail messages, transactions, etc.).
100
+
101
+ Functional design may seem like an odd constraint to work under. Why
102
+ should you avoid objects and side effects? There are theoretical and
103
+ practical advantages to the functional style:
104
+
105
+ * Formal provability.
106
+ * Modularity.
107
+ * Composability.
108
+ * Ease of debugging and testing.
109
+
110
+ Formal provability
111
+ ''''''''''''''''''''''
112
+
113
+ A theoretical benefit is that it's easier to construct a mathematical proof
114
+ that a functional program is correct.
115
+
116
+ For a long time researchers have been interested in finding ways to
117
+ mathematically prove programs correct. This is different from testing
118
+ a program on numerous inputs and concluding that its output is usually
119
+ correct, or reading a program's source code and concluding that the
120
+ code looks right; the goal is instead a rigorous proof that a program
121
+ produces the right result for all possible inputs.
122
+
123
+ The technique used to prove programs correct is to write down
124
+ **invariants**, properties of the input data and of the program's
125
+ variables that are always true. For each line of code, you then show
126
+ that if invariants X and Y are true **before** the line is executed,
127
+ the slightly different invariants X' and Y' are true **after**
128
+ the line is executed. This continues until you reach the end of the
129
+ program, at which point the invariants should match the desired
130
+ conditions on the program's output.
131
+
132
+ Functional programming's avoidance of assignments arose because
133
+ assignments are difficult to handle with this technique;
134
+ assignments can break invariants that were true before the assignment
135
+ without producing any new invariants that can be propagated onward.
136
+
137
+ Unfortunately, proving programs correct is largely impractical and not
138
+ relevant to Python software. Even trivial programs require proofs that
139
+ are several pages long; the proof of correctness for a moderately
140
+ complicated program would be enormous, and few or none of the programs
141
+ you use daily (the Python interpreter, your XML parser, your web
142
+ browser) could be proven correct. Even if you wrote down or generated
143
+ a proof, there would then be the question of verifying the proof;
144
+ maybe there's an error in it, and you wrongly believe you've proved
145
+ the program correct.
146
+
147
+ Modularity
148
+ ''''''''''''''''''''''
149
+
150
+ A more practical benefit of functional programming is that it forces
151
+ you to break apart your problem into small pieces. Programs are more
152
+ modular as a result. It's easier to specify and write a small
153
+ function that does one thing than a large function that performs a
154
+ complicated transformation. Small functions are also easier to read
155
+ and to check for errors.
156
+
157
+
158
+ Ease of debugging and testing
159
+ ''''''''''''''''''''''''''''''''''
160
+
161
+ Testing and debugging a functional-style program is easier.
162
+
163
+ Debugging is simplified because functions are generally small and
164
+ clearly specified. When a program doesn't work, each function is an
165
+ interface point where you can check that the data are correct. You
166
+ can look at the intermediate inputs and outputs to quickly isolate the
167
+ function that's responsible for a bug.
168
+
169
+ Testing is easier because each function is a potential subject for a
170
+ unit test. Functions don't depend on system state that needs to be
171
+ replicated before running a test; instead you only have to synthesize
172
+ the right input and then check that the output matches expectations.
173
+
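To make the testing claim concrete, here is a minimal sketch of a unit test for a small function without side effects. The function ``strip_blank`` and the test class are hypothetical; the point is only that no system state has to be set up before the test runs.

```python
import unittest

def strip_blank(lines):
    """Return the non-blank lines, with surrounding whitespace removed."""
    return [line.strip() for line in lines if line.strip()]

class StripBlankTest(unittest.TestCase):
    def test_removes_blank_lines(self):
        self.assertEqual(strip_blank([' a \n', '\n', 'b']), ['a', 'b'])

    def test_empty_input(self):
        self.assertEqual(strip_blank([]), [])

# Run the tests programmatically so the sketch also works when pasted
# into an interactive session.
suite = unittest.TestLoader().loadTestsFromTestCase(StripBlankTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test only synthesizes an input list and compares the returned list with the expected one.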
174
+
175
+
176
+ Composability
177
+ ''''''''''''''''''''''
178
+
179
+ As you work on a functional-style program, you'll write a number of
180
+ functions with varying inputs and outputs. Some of these functions
181
+ will be unavoidably specialized to a particular application, but
182
+ others will be useful in a wide variety of programs. For example, a
183
+ function that takes a directory path and returns all the XML files in
184
+ the directory, or a function that takes a filename and returns its
185
+ contents, can be applied to many different situations.
186
+
187
+ Over time you'll form a personal library of utilities. Often you'll
188
+ assemble new programs by arranging existing functions in a new
189
+ configuration and writing a few functions specialized for the current
190
+ task.
191
+
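Here is a rough sketch of what assembling a program from small reusable pieces can look like. The three helper functions are invented for this example and are not part of any library.

```python
def non_blank(lines):
    """Yield only the lines that contain something besides whitespace."""
    for line in lines:
        if line.strip():
            yield line

def stripped(lines):
    """Yield each line with leading and trailing whitespace removed."""
    for line in lines:
        yield line.strip()

def numbered(lines):
    """Yield each line prefixed with its 1-based position."""
    for number, line in enumerate(lines):
        yield '%d: %s' % (number + 1, line)

raw = ['  alpha \n', '\n', 'beta\n', '   \n', ' gamma']

# A new "program" is just a new arrangement of the existing pieces.
report = list(numbered(stripped(non_blank(raw))))
```

Any of the three helpers could be reused on its own in a different program, or combined in a different order.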
192
+
193
+
194
+ Iterators
195
+ -----------------------
196
+
197
+ I'll start by looking at a Python language feature that's an important
198
+ foundation for writing functional-style programs: iterators.
199
+
200
+ An iterator is an object representing a stream of data; this object
201
+ returns the data one element at a time. A Python iterator must
202
+ support a method called ``next()`` that takes no arguments and always
203
+ returns the next element of the stream. If there are no more elements
204
+ in the stream, ``next()`` must raise the ``StopIteration`` exception.
205
+ Iterators don't have to be finite, though; it's perfectly reasonable
206
+ to write an iterator that produces an infinite stream of data.
207
+
208
+ The built-in ``iter()`` function takes an arbitrary object and tries
209
+ to return an iterator that will return the object's contents or
210
+ elements, raising ``TypeError`` if the object doesn't support
211
+ iteration. Several of Python's built-in data types support iteration,
212
+ the most common being lists and dictionaries. An object is called
213
+ an **iterable** object if you can get an iterator for it.
214
+
215
+ You can experiment with the iteration interface manually::
216
+
217
+ >>> L = [1,2,3]
218
+ >>> it = iter(L)
219
+ >>> print it
220
+ <iterator object at 0x8116870>
221
+ >>> it.next()
222
+ 1
223
+ >>> it.next()
224
+ 2
225
+ >>> it.next()
226
+ 3
227
+ >>> it.next()
228
+ Traceback (most recent call last):
229
+ File "<stdin>", line 1, in ?
230
+ StopIteration
231
+ >>>
232
+
233
+ Python expects iterable objects in several different contexts, the
234
+ most important being the ``for`` statement. In the statement ``for X in Y``,
235
+ Y must be an iterator or some object for which ``iter()`` can create
236
+ an iterator. These two statements are equivalent::
237
+
238
+     for i in iter(obj):
239
+         print i
240
+
241
+     for i in obj:
242
+         print i
243
+
244
+ Iterators can be materialized as lists or tuples by using the
245
+ ``list()`` or ``tuple()`` constructor functions::
246
+
247
+ >>> L = [1,2,3]
248
+ >>> iterator = iter(L)
249
+ >>> t = tuple(iterator)
250
+ >>> t
251
+ (1, 2, 3)
252
+
253
+ Sequence unpacking also supports iterators: if you know an iterator
254
+ will return N elements, you can unpack them into an N-tuple::
255
+
256
+ >>> L = [1,2,3]
257
+ >>> iterator = iter(L)
258
+ >>> a,b,c = iterator
259
+ >>> a,b,c
260
+ (1, 2, 3)
261
+
262
+ Built-in functions such as ``max()`` and ``min()`` can take a single
263
+ iterator argument and will return the largest or smallest element.
264
+ The ``"in"`` and ``"not in"`` operators also support iterators: ``X in
265
+ iterator`` is true if X is found in the stream returned by the
266
+ iterator. You'll run into obvious problems if the iterator is
267
+ infinite; ``max()``, ``min()``, and ``"not in"`` will never return, and
268
+ if the element X never appears in the stream, the ``"in"`` operator
269
+ won't return either.
270
+
271
+ Note that you can only go forward in an iterator; there's no way to
272
+ get the previous element, reset the iterator, or make a copy of it.
273
+ Iterator objects can optionally provide these additional capabilities,
274
+ but the iterator protocol only specifies the ``next()`` method.
275
+ Functions may therefore consume all of the iterator's output, and if
276
+ you need to do something different with the same stream, you'll have
277
+ to create a new iterator.
278
+
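The following sketch shows what "consuming" an iterator means in practice. It assumes nothing beyond the built-in ``iter()`` and ``list()`` functions and the ``in`` operator.

```python
numbers = iter([1, 2, 3, 4])

found = 2 in numbers          # consumes 1 and 2 while searching
remaining = list(numbers)     # only what is left over
exhausted = list(numbers)     # nothing more to return

# To traverse the data again, ask the original list for a new iterator.
again = list(iter([1, 2, 3, 4]))
```

After the ``in`` test, ``remaining`` holds only ``[3, 4]``, and a second ``list()`` call on the same iterator produces an empty list.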
279
+
280
+
281
+ Data Types That Support Iterators
282
+ '''''''''''''''''''''''''''''''''''
283
+
284
+ We've already seen how lists and tuples support iterators. In fact,
285
+ any Python sequence type, such as strings, will automatically support
286
+ creation of an iterator.
287
+
288
+ Calling ``iter()`` on a dictionary returns an iterator that will loop
289
+ over the dictionary's keys::
290
+
291
+ >>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
292
+ ... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
293
+ >>> for key in m:
294
+ ...     print key, m[key]
295
+ Mar 3
296
+ Feb 2
297
+ Aug 8
298
+ Sep 9
299
+ May 5
300
+ Jun 6
301
+ Jul 7
302
+ Jan 1
303
+ Apr 4
304
+ Nov 11
305
+ Dec 12
306
+ Oct 10
307
+
308
+ Note that the order is essentially random, because it's based on the
309
+ hash ordering of the objects in the dictionary.
310
+
311
+ Applying ``iter()`` to a dictionary always loops over the keys, but
312
+ dictionaries have methods that return other iterators. If you want to
313
+ iterate over keys, values, or key/value pairs, you can explicitly call
314
+ the ``iterkeys()``, ``itervalues()``, or ``iteritems()`` methods to
315
+ get an appropriate iterator.
316
+
317
+ The ``dict()`` constructor can accept an iterator that returns a
318
+ finite stream of ``(key, value)`` tuples::
319
+
320
+ >>> L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')]
321
+ >>> dict(iter(L))
322
+ {'Italy': 'Rome', 'US': 'Washington DC', 'France': 'Paris'}
323
+
324
+ Files also support iteration by calling the ``readline()``
325
+ method until there are no more lines in the file. This means you can
326
+ read each line of a file like this::
327
+
328
+     for line in file:
329
+         # do something for each line
330
+         ...
331
+
332
+ Sets can take their contents from an iterable and let you iterate over
333
+ the set's elements::
334
+
335
+     S = set((2, 3, 5, 7, 11, 13))
336
+     for i in S:
337
+         print i
338
+
339
+
340
+
341
+ Generator expressions and list comprehensions
342
+ ----------------------------------------------------
343
+
344
+ Two common operations on an iterator's output are 1) performing some
345
+ operation for every element, and 2) selecting a subset of elements that
346
+ meet some condition. For example, given a list of strings, you might
347
+ want to strip off trailing whitespace from each line or extract all
348
+ the strings containing a given substring.
349
+
350
+ List comprehensions and generator expressions (short form: "listcomps"
351
+ and "genexps") are a concise notation for such operations, borrowed
352
+ from the functional programming language Haskell
353
+ (http://www.haskell.org). You can strip leading and trailing whitespace
354
+ from each string in a stream with the following code::
355
+
356
+ line_list = [' line 1\n', 'line 2 \n', ...]
357
+
358
+ # Generator expression -- returns iterator
359
+ stripped_iter = (line.strip() for line in line_list)
360
+
361
+ # List comprehension -- returns list
362
+ stripped_list = [line.strip() for line in line_list]
363
+
364
+ You can select only certain elements by adding an ``"if"`` condition::
365
+
366
+ stripped_list = [line.strip() for line in line_list
367
+ if line != ""]
368
+
369
+ With a list comprehension, you get back a Python list;
370
+ ``stripped_list`` is a list containing the resulting lines, not an
371
+ iterator. Generator expressions return an iterator that computes the
372
+ values as necessary, not needing to materialize all the values at
373
+ once. This means that list comprehensions aren't useful if you're
374
+ working with iterators that return an infinite stream or a very large
375
+ amount of data. Generator expressions are preferable in these
376
+ situations.
377
+
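A short sketch of why generator expressions suit infinite streams. It borrows ``count()`` and ``islice()`` from the ``itertools`` module; the variable names are arbitrary.

```python
import itertools

# itertools.count() returns an endless stream: 0, 1, 2, ...
squares = (n * n for n in itertools.count())

# Nothing has been computed yet; islice() pulls just five values.
first_five = list(itertools.islice(squares, 5))

# The equivalent list comprehension, [n * n for n in itertools.count()],
# would never finish, because it tries to build the whole list first.
```

Only the five requested squares are ever computed; the rest of the stream is left untouched.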
378
+ Generator expressions are surrounded by parentheses ("()") and list
379
+ comprehensions are surrounded by square brackets ("[]"). Generator
380
+ expressions have the form::
381
+
382
+     ( expression for expr1 in sequence1
383
+                  if condition1
384
+                  for expr2 in sequence2
385
+                  if condition2
386
+                  for expr3 in sequence3 ...
387
+                  if condition3
388
+                  for exprN in sequenceN
389
+                  if conditionN )
390
+
391
+ Again, for a list comprehension only the outside brackets are
392
+ different (square brackets instead of parentheses).
393
+
394
+ The elements of the generated output will be the successive values of
395
+ ``expression``. The ``if`` clauses are all optional; if present,
396
+ ``expression`` is only evaluated and added to the result when
397
+ ``condition`` is true.
398
+
399
+ Generator expressions always have to be written inside parentheses,
400
+ but the parentheses signalling a function call also count. If you
401
+ want to create an iterator that will be immediately passed to a
402
+ function you can write::
403
+
404
+ obj_total = sum(obj.count for obj in list_all_objects())
405
+
406
+ The ``for...in`` clauses contain the sequences to be iterated over.
407
+ The sequences do not have to be the same length, because they are
408
+ iterated over from left to right, **not** in parallel. For each
409
+ element in ``sequence1``, ``sequence2`` is looped over from the
410
+ beginning. ``sequence3`` is then looped over for each
411
+ resulting pair of elements from ``sequence1`` and ``sequence2``.
412
+
413
+ To put it another way, a list comprehension or generator expression is
414
+ equivalent to the following Python code::
415
+
416
+     for expr1 in sequence1:
417
+         if not (condition1):
418
+             continue   # Skip this element
419
+         for expr2 in sequence2:
420
+             if not (condition2):
421
+                 continue   # Skip this element
422
+             ...
423
+             for exprN in sequenceN:
424
+                 if not (conditionN):
425
+                     continue   # Skip this element
426
+
427
+                 # Output the value of
428
+                 # the expression.
429
+
430
+ This means that when there are multiple ``for...in`` clauses but no
431
+ ``if`` clauses, the length of the resulting output will be equal to
432
+ the product of the lengths of all the sequences. If you have two
433
+ sequences of length 3, the output list is 9 elements long::
434
+
435
+ >>> seq1 = 'abc'
436
+ >>> seq2 = (1,2,3)
437
+ >>> [ (x,y) for x in seq1 for y in seq2]
438
+ [('a', 1), ('a', 2), ('a', 3),
439
+ ('b', 1), ('b', 2), ('b', 3),
440
+ ('c', 1), ('c', 2), ('c', 3)]
441
+
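To check the equivalence between a comprehension and nested loops when ``if`` clauses are present, here is a small sketch with made-up sequences and conditions. Both forms should produce the same list.

```python
seq1 = [1, 2, 3, 4]
seq2 = 'ab'

# Comprehension form: keep even numbers, and pair each one with every
# letter except 'b'.
pairs = [(x, y) for x in seq1 if x % 2 == 0
                for y in seq2 if y != 'b']

# The same computation written out as nested loops.
expanded = []
for x in seq1:
    if not (x % 2 == 0):
        continue            # Skip this element
    for y in seq2:
        if not (y != 'b'):
            continue        # Skip this element
        expanded.append((x, y))
```

Because each ``if`` clause filters before the next ``for`` clause runs, the output here has two elements rather than the full product of eight.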
442
+ To avoid introducing an ambiguity into Python's grammar, if
443
+ ``expression`` is creating a tuple, it must be surrounded with
444
+ parentheses. The first list comprehension below is a syntax error,
445
+ while the second one is correct::
446
+
447
+ # Syntax error
448
+ [ x,y for x in seq1 for y in seq2]
449
+ # Correct
450
+ [ (x,y) for x in seq1 for y in seq2]
451
+
452
+
453
+ Generators
454
+ -----------------------
455
+
456
+ Generators are a special class of functions that simplify the task of
457
+ writing iterators. Regular functions compute a value and return it,
458
+ but generators return an iterator that returns a stream of values.
459
+
460
+ You're doubtless familiar with how regular function calls work in
461
+ Python or C. When you call a function, it gets a private namespace
462
+ where its local variables are created. When the function reaches a
463
+ ``return`` statement, the local variables are destroyed and the
464
+ value is returned to the caller. A later call to the same function
465
+ creates a new private namespace and a fresh set of local
466
+ variables. But, what if the local variables weren't thrown away on
467
+ exiting a function? What if you could later resume the function where
468
+ it left off? This is what generators provide; they can be thought of
469
+ as resumable functions.
470
+
471
+ Here's the simplest example of a generator function::
472
+
473
+     def generate_ints(N):
474
+         for i in range(N):
475
+             yield i
476
+
477
+ Any function containing a ``yield`` keyword is a generator function;
478
+ this is detected by Python's bytecode compiler which compiles the
479
+ function specially as a result.
480
+
481
+ When you call a generator function, it doesn't return a single value;
482
+ instead it returns a generator object that supports the iterator
483
+ protocol. On executing the ``yield`` expression, the generator
484
+ outputs the value of ``i``, similar to a ``return``
485
+ statement. The big difference between ``yield`` and a
486
+ ``return`` statement is that on reaching a ``yield`` the
487
+ generator's state of execution is suspended and local variables are
488
+ preserved. On the next call to the generator's ``.next()`` method,
489
+ the function will resume executing.
490
+
491
+ Here's a sample usage of the ``generate_ints()`` generator::
492
+
493
+ >>> gen = generate_ints(3)
494
+ >>> gen
495
+ <generator object at 0x8117f90>
496
+ >>> gen.next()
497
+ 0
498
+ >>> gen.next()
499
+ 1
500
+ >>> gen.next()
501
+ 2
502
+ >>> gen.next()
503
+ Traceback (most recent call last):
504
+   File "<stdin>", line 1, in ?
505
+   File "<stdin>", line 2, in generate_ints
506
+ StopIteration
507
+
508
+ You could equally write ``for i in generate_ints(5)``, or
509
+ ``a,b,c = generate_ints(3)``.
510
+
511
+ Inside a generator function, the ``return`` statement can only be used
512
+ without a value, and signals the end of the procession of values;
513
+ after executing a ``return`` the generator cannot return any further
514
+ values. ``return`` with a value, such as ``return 5``, is a syntax
515
+ error inside a generator function. The end of the generator's results
516
+ can also be indicated by raising ``StopIteration`` manually, or by
517
+ just letting the flow of execution fall off the bottom of the
518
+ function.
519
+
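As an illustration of ending a generator early with a bare ``return``, here is a sketch using a hypothetical ``first_words()`` generator.

```python
def first_words(lines, limit):
    """Yield the first word of each line, stopping after ``limit`` lines
    or at the first blank line, whichever comes first."""
    for count, line in enumerate(lines):
        if count >= limit or not line.strip():
            return          # ends the generator; no value is given
        yield line.split()[0]

words = list(first_words(['red fish', 'blue fish', '', 'one fish'], 10))
capped = list(first_words(['a b', 'c d', 'e f'], 2))
```

In the first call the blank line triggers the ``return``, so ``'one fish'`` is never examined; in the second call the limit ends the stream after two lines.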
520
+ You could achieve the effect of generators manually by writing your
521
+ own class and storing all the local variables of the generator as
522
+ instance variables. For example, returning a list of integers could
523
+ be done by setting ``self.count`` to 0, and having the
524
+ ``next()`` method increment ``self.count`` and return it.
525
+ However, for a moderately complicated generator, writing a
526
+ corresponding class can be much messier.
527
+
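For comparison, here is roughly what such a hand-written class might look like for the ``generate_ints()`` example. The class name is invented, and the ``__next__`` alias is an assumption added only so that the sketch also runs on Python 3, where the method has a different name.

```python
class GenerateInts(object):
    """Hand-written counterpart of the generate_ints() generator."""

    def __init__(self, N):
        self.N = N
        self.count = 0          # plays the role of the local variable i

    def __iter__(self):
        return self

    def next(self):
        if self.count >= self.N:
            raise StopIteration
        value = self.count
        self.count += 1
        return value

    # Assumption beyond the HOWTO: Python 3 looks for __next__ instead
    # of next, so alias it to keep the sketch usable there as well.
    __next__ = next

values = list(GenerateInts(3))
```

Even for this trivial case the class needs explicit bookkeeping that the three-line generator gets for free.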
528
+ The test suite included with Python's library, ``test_generators.py``,
529
+ contains a number of more interesting examples. Here's one generator
530
+ that implements an in-order traversal of a tree using generators
531
+ recursively.
532
+
533
+ ::
534
+
535
+     # A recursive generator that generates Tree leaves in in-order.
536
+     def inorder(t):
537
+         if t:
538
+             for x in inorder(t.left):
539
+                 yield x
540
+
541
+             yield t.label
542
+
543
+             for x in inorder(t.right):
544
+                 yield x
545
+
546
+ Two other examples in ``test_generators.py`` produce
547
+ solutions for the N-Queens problem (placing N queens on an NxN
548
+ chess board so that no queen threatens another) and the Knight's Tour
549
+ (finding a route that takes a knight to every square of an NxN chessboard
550
+ without visiting any square twice).
551
+
552
+
553
+
554
+ Passing values into a generator
555
+ ''''''''''''''''''''''''''''''''''''''''''''''
556
+
557
+ In Python 2.4 and earlier, generators only produced output. Once a
558
+ generator's code was invoked to create an iterator, there was no way to
559
+ pass any new information into the function when its execution was
560
+ resumed. You could hack together this ability by making the
561
+ generator look at a global variable or by passing in some mutable object
562
+ that callers then modify, but these approaches are messy.
563
+
564
+ In Python 2.5 there's a simple way to pass values into a generator.
565
+ ``yield`` became an expression, returning a value that can be assigned
566
+ to a variable or otherwise operated on::
567
+
568
+ val = (yield i)
569
+
570
+ I recommend that you **always** put parentheses around a ``yield``
571
+ expression when you're doing something with the returned value, as in
572
+ the above example. The parentheses aren't always necessary, but it's
573
+ easier to always add them instead of having to remember when they're
574
+ needed.
575
+
576
+ (PEP 342 explains the exact rules, which are that a
577
+ ``yield``-expression must always be parenthesized except when it
578
+ occurs at the top-level expression on the right-hand side of an
579
+ assignment. This means you can write ``val = yield i`` but have to
580
+ use parentheses when there's an operation, as in ``val = (yield i)
581
+ + 12``.)
582
+
583
+ Values are sent into a generator by calling its
584
+ ``send(value)`` method. This method resumes the
585
+ generator's code and the ``yield`` expression returns the specified
586
+ value. If the regular ``next()`` method is called, the
587
+ ``yield`` returns ``None``.
588
+
589
+ Here's a simple counter that increments by 1 and allows changing the
590
+ value of the internal counter.
591
+
592
+ ::
593
+
594
+     def counter(maximum):
595
+         i = 0
596
+         while i < maximum:
597
+             val = (yield i)
598
+             # If value provided, change counter
599
+             if val is not None:
600
+                 i = val
601
+             else:
602
+                 i += 1
603
+
604
+ And here's an example of changing the counter:
605
+
606
+ >>> it = counter(10)
607
+ >>> print it.next()
608
+ 0
609
+ >>> print it.next()
610
+ 1
611
+ >>> print it.send(8)
612
+ 8
613
+ >>> print it.next()
614
+ 9
615
+ >>> print it.next()
616
+ Traceback (most recent call last):
617
+   File "t.py", line 15, in ?
618
+ print it.next()
619
+ StopIteration
620
+
621
+ Because ``yield`` will often be returning ``None``, you
622
+ should always check for this case. Don't just use its value in
623
+ expressions unless you're sure that the ``send()`` method
624
+ will be the only method used to resume your generator function.
625
+
626
+ In addition to ``send()``, there are two other new methods on
627
+ generators:
628
+
629
+ * ``throw(type, value=None, traceback=None)`` is used to raise an exception inside the
630
+ generator; the exception is raised by the ``yield`` expression
631
+ where the generator's execution is paused.
632
+
633
+ * ``close()`` raises a ``GeneratorExit``
634
+ exception inside the generator to terminate the iteration.
635
+ On receiving this
636
+ exception, the generator's code must either raise
637
+ ``GeneratorExit`` or ``StopIteration``; catching the
638
+ exception and doing anything else is illegal and will trigger
639
+ a ``RuntimeError``. ``close()`` will also be called by
640
+ Python's garbage collector when the generator is garbage-collected.
641
+
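Here is a small sketch of ``throw()`` in action, using a made-up ``echo()`` generator. It uses the built-in ``next()`` function (available from Python 2.6) rather than the ``.next()`` method, purely so that the example also runs on later interpreters.

```python
def echo():
    """Yield back whatever was last sent in; recover from ValueError."""
    received = None
    while True:
        try:
            received = (yield received)
        except ValueError:
            # The exception raised by throw() surfaces at the yield.
            received = 'recovered'

gen = echo()
next(gen)                            # advance to the first yield
sent_back = gen.send('hello')
after_throw = gen.throw(ValueError)  # handled inside the generator
gen.close()
```

The ``throw()`` call returns the next value the generator yields after handling the exception, just as ``send()`` does.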
642
+ If you need to run cleanup code when a ``GeneratorExit`` occurs,
643
+ I suggest using a ``try: ... finally:`` suite instead of
644
+ catching ``GeneratorExit``.
645
+
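A minimal sketch of the ``try: ... finally:`` approach, with a hypothetical ``read_items()`` generator that records its cleanup in a list so the effect is visible. As before, the built-in ``next()`` is used instead of ``.next()`` only for portability.

```python
log = []

def read_items(items):
    log.append('opened')
    try:
        for item in items:
            yield item
    finally:
        # Runs when the generator finishes normally *and* when close()
        # raises GeneratorExit at the paused yield.
        log.append('closed')

gen = read_items(['a', 'b', 'c'])
first = next(gen)
gen.close()                 # triggers the finally clause
```

The cleanup runs even though the generator was abandoned after its first value.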
646
+ The cumulative effect of these changes is to turn generators from
647
+ one-way producers of information into both producers and consumers.
648
+
649
+ Generators also become **coroutines**, a more generalized form of
650
+ subroutines. Subroutines are entered at one point and exited at
651
+ another point (the top of the function, and a ``return``
652
+ statement), but coroutines can be entered, exited, and resumed at
653
+ many different points (the ``yield`` statements).
654
+
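As one possible illustration of a generator acting as both consumer and producer, here is a sketch of a running-average coroutine. The function name and the priming call via the built-in ``next()`` are choices made for this example.

```python
def running_average():
    """Coroutine: send() a number in, get the average so far back."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = (yield average)
        if value is None:
            continue            # resumed without a value; nothing to add
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)                       # prime the coroutine: run to first yield
results = [avg.send(10), avg.send(20), avg.send(60)]
```

Each ``send()`` both delivers a new number and collects the updated average, so data flows in both directions through the same ``yield``.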
655
+
656
+ Built-in functions
657
+ ----------------------------------------------
658
+
659
+ Let's look in more detail at built-in functions often used with iterators.
660
+
661
+ Two of Python's built-in functions, ``map()`` and ``filter()``, are
662
+ somewhat obsolete; they duplicate the features of list comprehensions
663
+ but return actual lists instead of iterators.
664
+
665
+ ``map(f, iterA, iterB, ...)`` returns a list containing ``f(iterA[0],
666
+ iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``.
667
+
668
+ ::
669
+
670
+     def upper(s):
671
+         return s.upper()
672
+     map(upper, ['sentence', 'fragment']) =>
673
+       ['SENTENCE', 'FRAGMENT']
674
+
675
+     [upper(s) for s in ['sentence', 'fragment']] =>
676
+       ['SENTENCE', 'FRAGMENT']
677
+
678
+ As shown above, you can achieve the same effect with a list
679
+ comprehension. The ``itertools.imap()`` function does the same thing
680
+ but can handle infinite iterators; it'll be discussed later, in the section on
681
+ the ``itertools`` module.
682
+
683
+ ``filter(predicate, iter)`` returns a list
684
+ that contains all the sequence elements that meet a certain condition,
685
+ and is similarly duplicated by list comprehensions.
686
+ A **predicate** is a function that returns the truth value of
687
+ some condition; for use with ``filter()``, the predicate must take a
688
+ single value.
689
+
690
+ ::
691
+
692
+     def is_even(x):
693
+         return (x % 2) == 0
694
+
695
+     filter(is_even, range(10)) =>
696
+       [0, 2, 4, 6, 8]
697
+
698
+ This can also be written as a list comprehension::
699
+
700
+ >>> [x for x in range(10) if is_even(x)]
701
+ [0, 2, 4, 6, 8]
702
+
703
+ ``filter()`` also has a counterpart in the ``itertools`` module,
704
+ ``itertools.ifilter()``, that returns an iterator and
705
+ can therefore handle infinite sequences just as ``itertools.imap()`` can.
706
+
707
+ ``reduce(func, iter, [initial_value])`` doesn't have a counterpart in
708
+ the ``itertools`` module because it cumulatively performs an operation
709
+ on all the iterable's elements and therefore can't be applied to
710
+ infinite iterables. ``func`` must be a function that takes two elements
711
+ and returns a single value. ``reduce()`` takes the first two elements
712
+ A and B returned by the iterator and calculates ``func(A, B)``. It
713
+ then requests the third element, C, calculates ``func(func(A, B),
714
+ C)``, combines this result with the fourth element returned, and
715
+ continues until the iterable is exhausted. If the iterable returns no
716
+ values at all, a ``TypeError`` exception is raised. If the initial
717
+ value is supplied, it's used as a starting point and
718
+ ``func(initial_value, A)`` is the first calculation.
719
+
720
+ ::
721
+
722
+ import operator
723
+ reduce(operator.concat, ['A', 'BB', 'C']) =>
724
+ 'ABBC'
725
+ reduce(operator.concat, []) =>
726
+ TypeError: reduce() of empty sequence with no initial value
727
+ reduce(operator.mul, [1,2,3], 1) =>
728
+ 6
729
+ reduce(operator.mul, [], 1) =>
730
+ 1
731
+
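To show the accumulation step by step, here is a sketch of how a function like ``reduce()`` could be written in plain Python. ``simple_reduce`` is a made-up name, not the built-in's actual implementation, and the built-in ``next()`` is again used only for portability.

```python
import operator

_missing = object()     # sentinel meaning "no initial value given"

def simple_reduce(func, iterable, initial=_missing):
    it = iter(iterable)
    if initial is _missing:
        try:
            result = next(it)           # start with the first element
        except StopIteration:
            raise TypeError('simple_reduce() of empty sequence '
                            'with no initial value')
    else:
        result = initial
    for element in it:
        result = func(result, element)  # fold in one element at a time
    return result

joined = simple_reduce(operator.concat, ['A', 'BB', 'C'])
product = simple_reduce(operator.mul, [1, 2, 3], 1)
empty = simple_reduce(operator.mul, [], 1)
```

The loop has to run to the end of the iterable before anything is returned, which is why this kind of function can't be applied to an infinite stream.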
732
+ If you use ``operator.add`` with ``reduce()``, you'll add up all the
733
+ elements of the iterable. This case is so common that there's a special
734
+ built-in called ``sum()`` to compute it::
735
+
736
+ reduce(operator.add, [1,2,3,4], 0) =>
737
+ 10
738
+ sum([1,2,3,4]) =>
739
+ 10
740
+ sum([]) =>
741
+ 0
742
+
743
+ For many uses of ``reduce()``, though, it can be clearer to just write
744
+ the obvious ``for`` loop::
745
+
746
+     # Instead of:
747
+     product = reduce(operator.mul, [1,2,3], 1)
748
+
749
+     # You can write:
750
+     product = 1
751
+     for i in [1,2,3]:
752
+         product *= i
753
+
754
+
755
+ ``enumerate(iter)`` counts off the elements in the iterable, returning
756
+ 2-tuples containing the count and each element.
757
+
758
+ ::
759
+
760
+ enumerate(['subject', 'verb', 'object']) =>
761
+ (0, 'subject'), (1, 'verb'), (2, 'object')
762
+
763
+ ``enumerate()`` is often used when looping through a list
764
+ and recording the indexes at which certain conditions are met::
765
+
766
+ f = open('data.txt', 'r')
767
+ for i, line in enumerate(f):
768
+     if line.strip() == '':
769
+         print 'Blank line at line #%i' % i
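The same pattern can be tried without a file on disk. This sketch assumes Python 3 and uses ``io.StringIO`` as a stand-in for ``open('data.txt')``; the sample text is invented. It also shows the optional second argument to ``enumerate()`` (available in Python 2.6 and later), which sets the starting count.

```python
import io

# Made-up file contents; lines at indexes 1 and 3 are blank.
text = "first\n\nthird\n   \nfifth\n"

# Zero-based indexes of the blank lines.
blank_indexes = [i for i, line in enumerate(io.StringIO(text))
                 if line.strip() == '']

# Human-friendly line numbers, counting from 1.
blank_line_numbers = [n for n, line in enumerate(io.StringIO(text), 1)
                      if line.strip() == '']
```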
770
+
771
+ ``sorted(iterable, [cmp=None], [key=None], [reverse=False])``
772
+ collects all the elements of the iterable into a list, sorts
773
+ the list, and returns the sorted result. The ``cmp``, ``key``,
774
+ and ``reverse`` arguments are passed through to the
775
+ constructed list's ``.sort()`` method.
776
+
777
+ ::
778
+
779
+ import random
780
+ # Generate 8 distinct random numbers in the range [0, 10000)
781
+ rand_list = random.sample(range(10000), 8)
782
+ rand_list =>
783
+ [769, 7953, 9828, 6431, 8442, 9878, 6213, 2207]
784
+ sorted(rand_list) =>
785
+ [769, 2207, 6213, 6431, 7953, 8442, 9828, 9878]
786
+ sorted(rand_list, reverse=True) =>
787
+ [9878, 9828, 8442, 7953, 6431, 6213, 2207, 769]
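The ``key`` argument is not shown above, so here is a brief sketch of it. The word list is invented. One caveat: the ``cmp`` argument exists only in Python 2; Python 3 dropped it, and ``functools.cmp_to_key()`` (Python 2.7 and 3.2 onward) adapts an old-style comparison function into a key.

```python
from functools import cmp_to_key

words = ['pear', 'Apple', 'fig', 'banana']  # made-up sample data

by_length = sorted(words, key=len)
case_insensitive = sorted(words, key=str.lower)
longest_first = sorted(words, key=len, reverse=True)

def compare_length(a, b):
    # Old-style comparison: negative, zero, or positive.
    return len(a) - len(b)

# Equivalent to sorted(words, cmp=compare_length) in Python 2.
by_length_cmp = sorted(words, key=cmp_to_key(compare_length))
```

``sorted()`` always builds a new list, so ``words`` itself is left unchanged.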
788
+
789
+ (For a more detailed discussion of sorting, see the Sorting mini-HOWTO
790
+ in the Python wiki at http://wiki.python.org/moin/HowTo/Sorting.)
791
+
792
+ The ``any(iter)`` and ``all(iter)`` built-ins look at
793
+ the truth values of an iterable's contents. ``any()`` returns
794
+ True if any element in the iterable is a true value, and ``all()``
795
+ returns True if all of the elements are true values::
796
+
797
+ any([0,1,0]) =>
798
+ True
799
+ any([0,0,0]) =>
800
+ False
801
+ any([1,1,1]) =>
802
+ True
803
+ all([0,1,0]) =>
804
+ False
805
+ all([0,0,0]) =>
806
+ False
807
+ all([1,1,1]) =>
808
+ True
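In practice ``any()`` and ``all()`` are usually fed a generator expression that tests each element. The sketch below uses an invented list of numbers and also shows two edge cases: the results for an empty iterable, and the fact that ``any()`` stops as soon as it finds a true value, so it can finish even on an infinite iterator.

```python
import itertools

numbers = [3, 8, 12, 7]  # made-up sample data

has_even = any(n % 2 == 0 for n in numbers)
all_positive = all(n > 0 for n in numbers)
all_even = all(n % 2 == 0 for n in numbers)

# Edge cases for an empty iterable.
any_of_nothing = any([])
all_of_nothing = all([])

# any() short-circuits: it stops at 6 and never exhausts count().
found_large = any(n > 5 for n in itertools.count())
```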
809
+
810
+
811
+ Small functions and the lambda expression
812
+ ----------------------------------------------
813
+
814
+ When writing functional-style programs, you'll often need little
815
+ functions that act as predicates or that combine elements in some way.
816
+
817
+ If there's a Python built-in or a module function that's suitable, you
818
+ don't need to define a new function at all::
819
+
820
+ stripped_lines = [line.strip() for line in lines]
821
+ existing_files = filter(os.path.exists, file_list)
822
+
823
+ If the function you need doesn't exist, you need to write it. One way
824
+ to write small functions is to use the ``lambda`` expression. ``lambda``
825
+ takes a number of parameters and an expression combining these parameters,
826
+ and creates a small function that returns the value of the expression::
827
+
828
+ lowercase = lambda x: x.lower()
829
+
830
+ print_assign = lambda name, value: name + '=' + str(value)
831
+
832
+ adder = lambda x, y: x+y
833
+
834
+ An alternative is to just use the ``def`` statement and define a
835
+ function in the usual way::
836
+
837
+ def lowercase(x):
838
+     return x.lower()
839
+
840
+ def print_assign(name, value):
841
+     return name + '=' + str(value)
842
+
843
+ def adder(x,y):
844
+     return x + y
845
+
846
+ Which alternative is preferable? That's a style question; my usual
847
+ course is to avoid using ``lambda``.
848
+
849
+ One reason for my preference is that ``lambda`` is quite limited in
850
+ the functions it can define. The result has to be computable as a
851
+ single expression, which means you can't have multiway
852
+ ``if... elif... else`` comparisons or ``try... except`` statements.
853
+ If you try to do too much in a ``lambda`` expression, you'll end up
854
+ with an overly complicated expression that's hard to read. Quick,
855
+ what's the following code doing?
856
+
857
+ ::
858
+
859
+ total = reduce(lambda a, b: (0, a[1] + b[1]), items)[1]
860
+
861
+ You can figure it out, but it takes time to disentangle the expression
862
+ to figure out what's going on. Using a short nested
863
+ ``def`` statement makes things a little bit better::
864
+
865
+ def combine (a, b):
866
+     return 0, a[1] + b[1]
867
+
868
+ total = reduce(combine, items)[1]
869
+
870
+ But it would be best of all if I had simply used a ``for`` loop::
871
+
872
+ total = 0
873
+ for a, b in items:
874
+     total += b
875
+
876
+ Or the ``sum()`` built-in and a generator expression::
877
+
878
+ total = sum(b for a,b in items)
879
+
880
+ Many uses of ``reduce()`` are clearer when written as ``for`` loops.
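To check that the four versions above really compute the same thing, here they are side by side. This is a sketch for Python 3 (hence the ``functools`` import), and the contents of ``items`` are invented: a list of 2-tuples whose second element is the number to add up.

```python
from functools import reduce  # assumption: Python 3

items = [('apples', 3), ('pears', 5), ('plums', 2)]  # made-up data

# 1. The hard-to-read lambda.
total_lambda = reduce(lambda a, b: (0, a[1] + b[1]), items)[1]

# 2. The same logic with a named function.
def combine(a, b):
    return 0, a[1] + b[1]

total_def = reduce(combine, items)[1]

# 3. The plain for loop.
total_loop = 0
for name, count in items:
    total_loop += count

# 4. sum() with a generator expression.
total_sum = sum(count for name, count in items)
```

There is one behavioural difference worth knowing: if ``items`` is empty, the two ``reduce()`` versions raise ``TypeError`` because no initial value is given, while the loop and ``sum()`` versions simply produce 0.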
881
+
882
+ Fredrik Lundh once suggested the following set of rules for refactoring
883
+ uses of ``lambda``:
884
+
885
+ 1) Write a lambda function.
886
+ 2) Write a comment explaining what the heck that lambda does.
887
+ 3) Study the comment for a while, and think of a name that captures
888
+ the essence of the comment.
889
+ 4) Convert the lambda to a def statement, using that name.
890
+ 5) Remove the comment.
891
+
892
+ I really like these rules, but you're free to disagree that this
893
+ lambda-free style is better.
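As an illustration of how the rules play out, here is one lambda taken through the steps. The ``records`` data and the function name ``oldest_first_then_name`` are invented for this sketch.

```python
records = [('carol', 31), ('alice', 25), ('bob', 31)]  # made-up (name, age) pairs

# Step 1: the lambda. It takes a moment to see what the key means.
with_lambda = sorted(records, key=lambda r: (-r[1], r[0]))

# Steps 2-5: the explanatory comment ("oldest first, ties broken by
# name") becomes the function's name, and the comment can then go.
def oldest_first_then_name(record):
    name, age = record
    return (-age, name)

with_def = sorted(records, key=oldest_first_then_name)
```

Both calls produce the same ordering; the second one documents itself at the call site.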
894
+
895
+
896
+ The itertools module
897
+ -----------------------
898
+
899
+ The ``itertools`` module contains a number of commonly-used iterators
900
+ as well as functions for combining several iterators. This section
901
+ will introduce the module's contents by showing small examples.
902
+
903
+ The module's functions fall into a few broad classes:
904
+
905
+ * Functions that create a new iterator based on an existing iterator.
906
+ * Functions for treating an iterator's elements as function arguments.
907
+ * Functions for selecting portions of an iterator's output.
908
+ * A function for grouping an iterator's output.
909
+
910
+ Creating new iterators
911
+ ''''''''''''''''''''''
912
+
913
+ ``itertools.count(n)`` returns an infinite stream of
914
+ integers, increasing by 1 each time. You can optionally supply the
915
+ starting number, which defaults to 0::
916
+
917
+ itertools.count() =>
918
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
919
+ itertools.count(10) =>
920
+ 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
921
+
922
+ ``itertools.cycle(iter)`` saves a copy of the contents of a provided
923
+ iterable and returns a new iterator that returns its elements from
924
+ first to last. The new iterator will repeat these elements infinitely.
925
+
926
+ ::
927
+
928
+ itertools.cycle([1,2,3,4,5]) =>
929
+ 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...
930
+
931
+ ``itertools.repeat(elem, [n])`` returns the provided element ``n``
932
+ times, or returns the element endlessly if ``n`` is not provided.
933
+
934
+ ::
935
+
936
+ itertools.repeat('abc') =>
937
+ abc, abc, abc, abc, abc, abc, abc, abc, abc, abc, ...
938
+ itertools.repeat('abc', 5) =>
939
+ abc, abc, abc, abc, abc
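Because ``count()``, ``cycle()`` and ``repeat()`` can be infinite, you need some way to cut them off before converting them to a list. A minimal sketch, using a small ``take()`` helper (my own name, not part of ``itertools``) built on ``itertools.islice()``:

```python
import itertools

def take(n, iterable):
    # Return the first n elements of iterable as a list.
    return list(itertools.islice(iterable, n))

from_ten = take(5, itertools.count(10))
cycled = take(7, itertools.cycle('abc'))
repeated_forever = take(4, itertools.repeat('abc'))
repeated_three = list(itertools.repeat('abc', 3))  # finite, so list() is safe
```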
940
+
941
+ ``itertools.chain(iterA, iterB, ...)`` takes an arbitrary number of
942
+ iterables as input, and returns all the elements of the first
943
+ iterator, then all the elements of the second, and so on, until all of
944
+ the iterables have been exhausted.
945
+
946
+ ::
947
+
948
+ itertools.chain(['a', 'b', 'c'], (1, 2, 3)) =>
949
+ a, b, c, 1, 2, 3
950
+
951
+ ``itertools.izip(iterA, iterB, ...)`` takes one element from each iterable
952
+ and returns them in a tuple::
953
+
954
+ itertools.izip(['a', 'b', 'c'], (1, 2, 3)) =>
955
+ ('a', 1), ('b', 2), ('c', 3)
956
+
957
+ It's similar to the built-in ``zip()`` function, but doesn't
958
+ construct an in-memory list and exhaust all the input iterators before
959
+ returning; instead tuples are constructed and returned only if they're
960
+ requested. (The technical term for this behaviour is
961
+ `lazy evaluation <http://en.wikipedia.org/wiki/Lazy_evaluation>`__.)
962
+
963
+ This iterator is intended to be used with iterables that are all of
964
+ the same length. If the iterables are of different lengths, the
965
+ resulting stream will be the same length as the shortest iterable.
966
+
967
+ ::
968
+
969
+ itertools.izip(['a', 'b'], (1, 2, 3)) =>
970
+ ('a', 1), ('b', 2)
971
+
972
+ You should avoid doing this, though, because an element may be taken
973
+ from the longer iterators and discarded. This means you can't go on
974
+ to use the iterators further because you risk skipping a discarded
975
+ element.
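The discarded element is easy to observe. This sketch assumes Python 3, where ``itertools.izip()`` no longer exists and the built-in ``zip()`` is lazy in the same way. Whether an element is lost depends on which position the longer iterator occupies, because elements are fetched from left to right.

```python
import itertools

# Longer iterator first: 3 is fetched, then the second iterator turns
# out to be exhausted, so the 3 is thrown away.
longer = iter([1, 2, 3])
pairs = list(zip(longer, iter(['a', 'b'])))
left_in_longer = list(longer)

# Longer iterator second: the first iterator ends before 3 is fetched.
longer_again = iter([1, 2, 3])
pairs_again = list(zip(iter(['a', 'b']), longer_again))
left_in_longer_again = list(longer_again)

# If no element should be dropped, zip_longest() pads instead
# (it is called izip_longest() in Python 2.6 and 2.7).
padded = list(itertools.zip_longest(['a', 'b'], [1, 2, 3]))
```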
976
+
977
+ ``itertools.islice(iter, [start], stop, [step])`` returns a stream
978
+ that's a slice of the iterator. With a single ``stop`` argument,
979
+ it will return the first ``stop``
980
+ elements. If you supply a starting index, you'll get ``stop-start``
981
+ elements, and if you supply a value for ``step``, elements will be
982
+ skipped accordingly. Unlike Python's string and list slicing, you
983
+ can't use negative values for ``start``, ``stop``, or ``step``.
984
+
985
+ ::
986
+
987
+ itertools.islice(range(10), 8) =>
988
+ 0, 1, 2, 3, 4, 5, 6, 7
989
+ itertools.islice(range(10), 2, 8) =>
990
+ 2, 3, 4, 5, 6, 7
991
+ itertools.islice(range(10), 2, 8, 2) =>
992
+ 2, 4, 6
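``islice()`` is most useful on iterators that can't be sliced with ``[start:stop]``, such as generators. In this sketch the ``squares()`` generator is invented for illustration; the last part shows that a negative argument is rejected with ``ValueError``.

```python
import itertools

def squares():
    # Infinite generator: 0, 1, 4, 9, 16, ...
    n = 0
    while True:
        yield n * n
        n += 1

first_five = list(itertools.islice(squares(), 5))
every_other = list(itertools.islice(squares(), 2, 8, 2))  # indexes 2, 4, 6

try:
    itertools.islice(squares(), -1)
    negative_rejected = False
except ValueError:
    negative_rejected = True
```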
993
+
994
+ ``itertools.tee(iter, [n])`` replicates an iterator; it returns ``n``
995
+ independent iterators that will all return the contents of the source
996
+ iterator. If you don't supply a value for ``n``, the default is 2.
997
+ Replicating iterators requires saving some of the contents of the source
998
+ iterator, so this can consume significant memory if the iterator is large
999
+ and one of the new iterators is consumed more than the others.
1000
+
1001
+ ::
1002
+
1003
+ itertools.tee( itertools.count() ) =>
1004
+ iterA, iterB
1005
+
1006
+ where iterA ->
1007
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
1008
+
1009
+ and iterB ->
1010
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
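A short sketch of ``tee()`` in use. Each returned iterator keeps its own position, so advancing one does not advance the other; the elements that ``iter_a`` has already produced are buffered until ``iter_b`` catches up, which is where the memory cost comes from.

```python
import itertools

iter_a, iter_b = itertools.tee(itertools.count())

first_a = [next(iter_a) for _ in range(5)]
# iter_b has not moved yet, so it still starts from 0.
first_b = [next(iter_b) for _ in range(3)]
next_a = next(iter_a)
```

After calling ``tee()``, use only the returned iterators; advancing the original iterator directly would make the copies miss elements.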
1011
+
1012
+
1013
+ Calling functions on elements
1014
+ '''''''''''''''''''''''''''''
1015
+
1016
+ Two functions are used for calling other functions on the contents of an
1017
+ iterable.
1018
+
1019
+ ``itertools.imap(f, iterA, iterB, ...)`` returns
1020
+ a stream containing ``f(iterA[0], iterB[0]), f(iterA[1], iterB[1]),
1021
+ f(iterA[2], iterB[2]), ...``::
1022
+
1023
+ itertools.imap(operator.add, [5, 6, 5], [1, 2, 3]) =>
1024
+ 6, 8, 8
1025
+
1026
+ The ``operator`` module contains a set of functions
1027
+ corresponding to Python's operators. Some examples are
1028
+ ``operator.add(a, b)`` (adds two values),
1029
+ ``operator.ne(a, b)`` (same as ``a!=b``),
1030
+ and
1031
+ ``operator.attrgetter('id')`` (returns a callable that
1032
+ fetches the ``"id"`` attribute).
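Here are the three ``operator`` functions just mentioned, applied with a lazy map. The sketch assumes Python 3, where the built-in ``map()`` returns an iterator and plays the role of ``itertools.imap()``; the ``User`` class is invented to give ``attrgetter()`` something to work on.

```python
import operator

sums = list(map(operator.add, [5, 6, 5], [1, 2, 3]))
differs = list(map(operator.ne, 'abc', 'abd'))

class User(object):
    # Hypothetical class with an "id" attribute.
    def __init__(self, id):
        self.id = id

get_id = operator.attrgetter('id')
ids = list(map(get_id, [User(7), User(3)]))
```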
1033
+
1034
+ ``itertools.starmap(func, iter)`` assumes that the iterable will
1035
+ return a stream of tuples, and calls ``func()`` using these tuples as the
1036
+ arguments::
1037
+
1038
+ itertools.starmap(os.path.join,
1039
+ [('/usr', 'bin', 'java'), ('/bin', 'python'),
1040
+ ('/usr', 'bin', 'perl'),('/usr', 'bin', 'ruby')])
1041
+ =>
1042
+ /usr/bin/java, /bin/python, /usr/bin/perl, /usr/bin/ruby
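A runnable version of the same idea follows. It uses ``posixpath.join`` rather than ``os.path.join`` so that the result doesn't depend on the platform, and adds a second example with the built-in ``pow()``. ``starmap(func, iter)`` behaves like calling ``func(*args)`` for each tuple.

```python
import itertools
import posixpath

path_parts = [('/usr', 'bin', 'java'), ('/bin', 'python')]
paths = list(itertools.starmap(posixpath.join, path_parts))

pow_args = [(2, 5), (3, 2), (10, 3)]
powers = list(itertools.starmap(pow, pow_args))

# The equivalent list comprehension, with explicit unpacking.
powers_unpacked = [pow(*args) for args in pow_args]
```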
1043
+
1044
+
1045
+ Selecting elements
1046
+ ''''''''''''''''''
1047
+
1048
+ Another group of functions chooses a subset of an iterator's elements
1049
+ based on a predicate.
1050
+
1051
+ ``itertools.ifilter(predicate, iter)`` returns all the elements for
1052
+ which the predicate returns true::
1053
+
1054
+ def is_even(x):
1055
+     return (x % 2) == 0
1056
+
1057
+ itertools.ifilter(is_even, itertools.count()) =>
1058
+ 0, 2, 4, 6, 8, 10, 12, 14, ...
1059
+
1060
+ ``itertools.ifilterfalse(predicate, iter)`` is the opposite,
1061
+ returning all elements for which the predicate returns false::
1062
+
1063
+ itertools.ifilterfalse(is_even, itertools.count()) =>
1064
+ 1, 3, 5, 7, 9, 11, 13, 15, ...
1065
+
1066
+ ``itertools.takewhile(predicate, iter)`` returns elements for as long
1067
+ as the predicate returns true. Once the predicate returns false,
1068
+ the iterator will signal the end of its results.
1069
+
1070
+ ::
1071
+
1072
+ def less_than_10(x):
1073
+     return (x < 10)
1074
+
1075
+ itertools.takewhile(less_than_10, itertools.count()) =>
1076
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
1077
+
1078
+ itertools.takewhile(is_even, itertools.count()) =>
1079
+ 0
1080
+
1081
+ ``itertools.dropwhile(predicate, iter)`` discards elements while the
1082
+ predicate returns true, and then returns the rest of the iterable's
1083
+ results.
1084
+
1085
+ ::
1086
+
1087
+ itertools.dropwhile(less_than_10, itertools.count()) =>
1088
+ 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
1089
+
1090
+ itertools.dropwhile(is_even, itertools.count()) =>
1091
+ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
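The four selection functions can be compared on small inputs. This sketch assumes Python 3, where ``ifilter()`` is the built-in ``filter()`` and ``ifilterfalse()`` is ``itertools.filterfalse()``. The ``take()`` and ``below_5()`` helpers and the ``data`` list are invented for illustration.

```python
import itertools

def is_even(x):
    return (x % 2) == 0

def below_5(x):
    return x < 5

def take(n, iterable):
    # Return the first n elements of iterable as a list.
    return list(itertools.islice(iterable, n))

evens = take(5, filter(is_even, itertools.count()))
odds = take(5, itertools.filterfalse(is_even, itertools.count()))

data = [1, 4, 6, 4, 1]
taken = list(itertools.takewhile(below_5, data))
dropped = list(itertools.dropwhile(below_5, data))
```

Note that ``takewhile()`` and ``dropwhile()`` only examine the leading run of elements: the trailing 4 and 1 in ``data`` satisfy the predicate but still appear in ``dropped``, because the predicate is no longer consulted once it has returned false.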
1092
+
1093
+
1094
+ Grouping elements
1095
+ '''''''''''''''''
1096
+
1097
+ The last function I'll discuss, ``itertools.groupby(iter,
1098
+ key_func=None)``, is the most complicated. ``key_func(elem)`` is a
1099
+ function that can compute a key value for each element returned by the
1100
+ iterable. If you don't supply a key function, the key is simply each
1101
+ element itself.
1102
+
1103
+ ``groupby()`` collects all the consecutive elements from the
1104
+ underlying iterable that have the same key value, and returns a stream
1105
+ of 2-tuples containing a key value and an iterator for the elements
1106
+ with that key.
1107
+
1108
+ ::
1109
+
1110
+ city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
1111
+ ('Anchorage', 'AK'), ('Nome', 'AK'),
1112
+ ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
1113
+ ...
1114
+ ]
1115
+
1116
+ def get_state ((city, state)):
1117
+     return state
1118
+
1119
+ itertools.groupby(city_list, get_state) =>
1120
+ ('AL', iterator-1),
1121
+ ('AK', iterator-2),
1122
+ ('AZ', iterator-3), ...
1123
+
1124
+ where
1125
+ iterator-1 =>
1126
+ ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
1127
+ iterator-2 =>
1128
+ ('Anchorage', 'AK'), ('Nome', 'AK')
1129
+ iterator-3 =>
1130
+ ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
1131
+
1132
+ ``groupby()`` assumes that the underlying iterable's contents will
1133
+ already be sorted based on the key. Note that the returned iterators
1134
+ also use the underlying iterable, so you have to consume the results
1135
+ of iterator-1 before requesting iterator-2 and its corresponding key.
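The sorting requirement is easiest to see with input that is not sorted. In this sketch the city list is deliberately shuffled, and ``get_state()`` takes a single tuple argument and indexes into it, since tuple parameters in a ``def`` are not accepted by Python 3. Each group is turned into a list immediately, before the loop moves on to the next group.

```python
import itertools

city_list = [('Decatur', 'AL'), ('Anchorage', 'AK'),
             ('Huntsville', 'AL'), ('Nome', 'AK')]

def get_state(city_state):
    return city_state[1]

# Unsorted input: every change of key starts a new group.
unsorted_keys = [state for state, group
                 in itertools.groupby(city_list, get_state)]

# Sorted input: one group per state.
cities_by_state = {}
for state, group in itertools.groupby(sorted(city_list, key=get_state),
                                      get_state):
    # Consume the group now; it is invalid once groupby() advances.
    cities_by_state[state] = [city for city, _ in group]
```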
1136
+
1137
+
1138
+ The functools module
1139
+ ----------------------------------------------
1140
+
1141
+ The ``functools`` module in Python 2.5 contains some higher-order
1142
+ functions. A **higher-order function** takes one or more functions as
1143
+ input and returns a new function. The most useful tool in this module
1144
+ is the ``partial()`` function.
1145
+
1146
+ For programs written in a functional style, you'll sometimes want to
1147
+ construct variants of existing functions that have some of the
1148
+ parameters filled in. Consider a Python function ``f(a, b, c)``; you
1149
+ may wish to create a new function ``g(b, c)`` that's equivalent to
1150
+ ``f(1, b, c)``; you're filling in a value for one of ``f()``'s parameters.
1151
+ This is called "partial function application".
1152
+
1153
+ The constructor for ``partial`` takes the arguments ``(function, arg1,
1154
+ arg2, ... kwarg1=value1, kwarg2=value2)``. The resulting object is
1155
+ callable, so you can just call it to invoke ``function`` with the
1156
+ filled-in arguments.
1157
+
1158
+ Here's a small but realistic example::
1159
+
1160
+ import functools
1161
+
1162
+ def log (message, subsystem):
1163
+     "Write the contents of 'message' to the specified subsystem."
1164
+     print '%s: %s' % (subsystem, message)
1165
+     ...
1166
+
1167
+ server_log = functools.partial(log, subsystem='server')
1168
+ server_log('Unable to open socket')
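The sketch below covers both forms of filling in arguments: positionally, as in the ``f(a, b, c)`` discussion, and by keyword, as in the logging example. To keep it testable, this ``log()`` returns the formatted string instead of printing it; that change is mine. It also shows that a ``partial`` object exposes what it wraps, and that a keyword filled in by ``partial()`` can still be overridden in the call.

```python
import functools

def f(a, b, c):
    return (a, b, c)

g = functools.partial(f, 1)          # g(b, c) is f(1, b, c)
g_result = g(2, 3)

def log(message, subsystem):
    # Variant of the example above that returns the text.
    return '%s: %s' % (subsystem, message)

server_log = functools.partial(log, subsystem='server')
server_line = server_log('Unable to open socket')

# The pieces are available as attributes.
wrapped = server_log.func
filled_in = server_log.keywords

# A keyword argument given in the call wins over the stored one.
db_line = server_log('Connection lost', subsystem='db')
```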
1169
+
1170
+
1171
+ The operator module
1172
+ -------------------
1173
+
1174
+ The ``operator`` module was mentioned earlier. It contains a set of
1175
+ functions corresponding to Python's operators. These functions
1176
+ are often useful in functional-style code because they save you
1177
+ from writing trivial functions that perform a single operation.
1178
+
1179
+ Some of the functions in this module are:
1180
+
1181
+ * Math operations: ``add()``, ``sub()``, ``mul()``, ``div()``, ``floordiv()``,
1182
+ ``abs()``, ...
1183
+ * Logical operations: ``not_()``, ``truth()``.
1184
+ * Bitwise operations: ``and_()``, ``or_()``, ``invert()``.
1185
+ * Comparisons: ``eq()``, ``ne()``, ``lt()``, ``le()``, ``gt()``, and ``ge()``.
1186
+ * Object identity: ``is_()``, ``is_not()``.
1187
+
1188
+ Consult `the operator module's documentation <http://docs.python.org/lib/module-operator.html>`__ for a complete
1189
+ list.
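A few of these functions in action, one from each group in the list. The sketch assumes Python 3, so ``reduce()`` is imported from ``functools``. Be aware that ``operator.div()`` exists only in Python 2; Python 3 provides ``truediv()`` and ``floordiv()`` instead.

```python
import operator
from functools import reduce  # assumption: Python 3

product = reduce(operator.mul, [1, 2, 3, 4], 1)          # math
negated = list(map(operator.not_, [0, 1, '', 'x']))      # logical
masked = operator.and_(0b1100, 0b1010)                   # bitwise
ordered = operator.lt(3, 5)                              # comparison
same_object = operator.is_(None, None)                   # identity
quotient = operator.floordiv(7, 2)
```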
1190
+
1191
+
1192
+
1193
+ The functional module
1194
+ ---------------------
1195
+
1196
+ Collin Winter's `functional module <http://oakwinter.com/code/functional/>`__
1197
+ provides a number of more
1198
+ advanced tools for functional programming. It also reimplements
1199
+ several Python built-ins, trying to make them more intuitive to those
1200
+ used to functional programming in other languages.
1201
+
1202
+ This section contains an introduction to some of the most important
1203
+ functions in ``functional``; full documentation can be found at `the
1204
+ project's website <http://oakwinter.com/code/functional/documentation/>`__.
1205
+
1206
+ ``compose(outer, inner, unpack=False)``
1207
+
1208
+ The ``compose()`` function implements function composition.
1209
+ In other words, it returns a wrapper around the ``outer`` and ``inner`` callables, such
1210
+ that the return value from ``inner`` is fed directly to ``outer``. That is,
1211
+
1212
+ ::
1213
+
1214
+ >>> def add(a, b):
1215
+ ...     return a + b
1216
+ ...
1217
+ >>> def double(a):
1218
+ ...     return 2 * a
1219
+ ...
1220
+ >>> compose(double, add)(5, 6)
1221
+ 22
1222
+
1223
+ is equivalent to
1224
+
1225
+ ::
1226
+
1227
+ >>> double(add(5, 6))
1228
+ 22
1229
+
1230
+ The ``unpack`` keyword is provided to work around the fact that Python functions are not always
1231
+ `fully curried <http://en.wikipedia.org/wiki/Currying>`__.
1232
+ By default, it is expected that the ``inner`` function will return a single object and that the ``outer``
1233
+ function will take a single argument. Setting the ``unpack`` argument causes ``compose`` to expect a
1234
+ tuple from ``inner`` which will be expanded before being passed to ``outer``. Put simply,
1235
+
1236
+ ::
1237
+
1238
+ compose(f, g)(5, 6)
1239
+
1240
+ is equivalent to::
1241
+
1242
+ f(g(5, 6))
1243
+
1244
+ while
1245
+
1246
+ ::
1247
+
1248
+ compose(f, g, unpack=True)(5, 6)
1249
+
1250
+ is equivalent to::
1251
+
1252
+ f(*g(5, 6))
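If you don't have the ``functional`` module installed, the behaviour described above can be approximated in a few lines. The ``compose()`` below is my own stand-in written from the description in this section; it is not the module's actual implementation and may differ from it in details.

```python
def compose(outer, inner, unpack=False):
    # Hypothetical stand-in for functional.compose().
    def composed(*args, **kwargs):
        result = inner(*args, **kwargs)
        if unpack:
            # Expand the tuple returned by inner into separate arguments.
            return outer(*result)
        return outer(result)
    return composed

def add(a, b):
    return a + b

def double(a):
    return 2 * a

doubled_sum = compose(double, add)(5, 6)          # double(add(5, 6))

# divmod(17, 5) returns the tuple (3, 2); unpack=True passes it on
# to add() as two arguments.
quotient_plus_remainder = compose(add, divmod, unpack=True)(17, 5)
```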
1253
+
1254
+ Even though ``compose()`` only accepts two functions, it's trivial to
1255
+ build up a version that will compose any number of functions. We'll
1256
+ use ``reduce()``, ``compose()`` and ``partial()`` (the last of which
1257
+ is provided by both ``functional`` and ``functools``).
1258
+
1259
+ ::
1260
+
1261
+ from functional import compose, partial
1262
+
1263
+ multi_compose = partial(reduce, compose)
1264
+
1265
+
1266
+ We can also use ``map()``, ``compose()`` and ``partial()`` to craft a
1267
+ version of ``"".join(...)`` that converts its arguments to strings::
1268
+
1269
+ from functional import compose, partial
1270
+
1271
+ join = compose("".join, partial(map, str))
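Both definitions can be tried out with the standard library alone. This sketch assumes Python 3, takes ``partial()`` and ``reduce()`` from ``functools``, and uses a minimal two-function ``compose()`` of my own in place of the one from ``functional``. Note that ``multi_compose`` takes a single list of functions, and that the functions are applied from right to left.

```python
from functools import partial, reduce  # assumption: Python 3

def compose(outer, inner):
    # Minimal hypothetical stand-in for functional.compose().
    def composed(*args, **kwargs):
        return outer(inner(*args, **kwargs))
    return composed

multi_compose = partial(reduce, compose)

# Equivalent to len(str.split(str.strip(s))).
word_count = multi_compose([len, str.split, str.strip])
count = word_count('  functional style in Python  ')

join = compose("".join, partial(map, str))
joined = join([1, 2.5, 'x'])
```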
1272
+
1273
+
1274
+ ``flip(func)``
1275
+
1276
+ ``flip()`` wraps the callable in ``func`` and
1277
+ causes it to receive its non-keyword arguments in reverse order.
1278
+
1279
+ ::
1280
+
1281
+ >>> def triple(a, b, c):
1282
+ ...     return (a, b, c)
1283
+ ...
1284
+ >>> triple(5, 6, 7)
1285
+ (5, 6, 7)
1286
+ >>>
1287
+ >>> flipped_triple = flip(triple)
1288
+ >>> flipped_triple(5, 6, 7)
1289
+ (7, 6, 5)
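A function with this behaviour is short enough to sketch directly. The ``flip()`` below is written from the description above and is not the ``functional`` module's own code; keyword arguments are passed through unchanged.

```python
import operator

def flip(func):
    # Hypothetical stand-in for functional.flip().
    def flipped(*args, **kwargs):
        return func(*reversed(args), **kwargs)
    return flipped

def triple(a, b, c):
    return (a, b, c)

flipped_triple = flip(triple)(5, 6, 7)

# Useful when the argument order of an existing function is inconvenient:
# flip(operator.sub)(1, 10) computes 10 - 1.
difference = flip(operator.sub)(1, 10)
```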
1290
+
1291
+ ``foldl(func, start, iterable)``
1292
+
1293
+ ``foldl()`` takes a binary function, a starting value (usually some kind of 'zero'), and an iterable.
1294
+ The function is applied to the starting value and the first element of the list, then the result of
1295
+ that and the second element of the list, then the result of that and the third element of the list,
1296
+ and so on.
1297
+
1298
+ This means that a call such as::
1299
+
1300
+ foldl(f, 0, [1, 2, 3])
1301
+
1302
+ is equivalent to::
1303
+
1304
+ f(f(f(0, 1), 2), 3)
1305
+
1306
+
1307
+ ``foldl()`` is roughly equivalent to the following recursive function::
1308
+
1309
+ def foldl(func, start, seq):
1310
+     if len(seq) == 0:
1311
+         return start
1312
+
1313
+     return foldl(func, func(start, seq[0]), seq[1:])
1314
+
1315
+ Speaking of equivalence, the above ``foldl`` call can be expressed in terms of the built-in ``reduce`` like
1316
+ so::
1317
+
1318
+ reduce(f, [1, 2, 3], 0)
1319
+
1320
+
1321
+ We can use ``foldl()``, ``operator.concat()`` and ``partial()`` to
1322
+ write a cleaner, more aesthetically-pleasing version of Python's
1323
+ ``"".join(...)`` idiom::
1324
+
1325
+ from functional import foldl, partial
1326
+ from operator import concat
1327
+
1328
+ join = partial(foldl, concat, "")
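To experiment with ``foldl()`` without installing ``functional``, an iterative version is enough. This one is my own sketch, not the module's implementation. Unlike the recursive definition shown earlier, it needs neither ``len()`` nor slicing, so it also accepts generators. It assumes Python 3 for the ``functools`` import of ``reduce()``.

```python
from functools import partial, reduce  # assumption: Python 3
from operator import concat, sub

def foldl(func, start, iterable):
    # Hypothetical iterative stand-in for functional.foldl().
    result = start
    for elem in iterable:
        result = func(result, elem)
    return result

# Subtraction makes the left-to-right grouping visible:
# ((0 - 1) - 2) - 3
folded = foldl(sub, 0, [1, 2, 3])
reduced = reduce(sub, [1, 2, 3], 0)

join = partial(foldl, concat, "")
joined = join(['a', 'bb', 'c'])
joined_from_generator = join(ch for ch in 'xyz')
```

For joining many strings, ``"".join(...)`` is likely to remain the faster choice, since folding with ``concat`` builds a new intermediate string at every step.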
1329
+
1330
+
1331
+ Revision History and Acknowledgements
1332
+ ------------------------------------------------
1333
+
1334
+ The author would like to thank the following people for offering
1335
+ suggestions, corrections and assistance with various drafts of this
1336
+ article: Ian Bicking, Nick Coghlan, Nick Efford, Raymond Hettinger,
1337
+ Jim Jewett, Mike Krell, Leandro Lameiro, Jussi Salmela,
1338
+ Collin Winter, Blake Winton.
1339
+
1340
+ Version 0.1: posted June 30 2006.
1341
+
1342
+ Version 0.11: posted July 1 2006. Typo fixes.
1343
+
1344
+ Version 0.2: posted July 10 2006. Merged genexp and listcomp
1345
+ sections into one. Typo fixes.
1346
+
1347
+ Version 0.21: Added more references suggested on the tutor mailing list.
1348
+
1349
+ Version 0.30: Adds a section on the ``functional`` module written by
1350
+ Collin Winter; adds short section on the operator module; a few other
1351
+ edits.
1352
+
1353
+
1354
+ References
1355
+ --------------------
1356
+
1357
+ General
1358
+ '''''''''''''''
1359
+
1360
+ **Structure and Interpretation of Computer Programs**, by
1361
+ Harold Abelson and Gerald Jay Sussman with Julie Sussman.
1362
+ Full text at http://mitpress.mit.edu/sicp/.
1363
+ In this classic textbook of computer science, chapters 2 and 3 discuss the
1364
+ use of sequences and streams to organize the data flow inside a
1365
+ program. The book uses Scheme for its examples, but many of the
1366
+ design approaches described in these chapters are applicable to
1367
+ functional-style Python code.
1368
+
1369
+ http://www.defmacro.org/ramblings/fp.html: A general
1370
+ introduction to functional programming that uses Java examples
1371
+ and has a lengthy historical introduction.
1372
+
1373
+ http://en.wikipedia.org/wiki/Functional_programming:
1374
+ General Wikipedia entry describing functional programming.
1375
+
1376
+ http://en.wikipedia.org/wiki/Coroutine:
1377
+ Entry for coroutines.
1378
+
1379
+ http://en.wikipedia.org/wiki/Currying:
1380
+ Entry for the concept of currying.
1381
+
1382
+ Python-specific
1383
+ '''''''''''''''''''''''''''
1384
+
1385
+ http://gnosis.cx/TPiP/:
1386
+ The first chapter of David Mertz's book :title-reference:`Text Processing in Python`
1387
+ discusses functional programming for text processing, in the section titled
1388
+ "Utilizing Higher-Order Functions in Text Processing".
1389
+
1390
+ Mertz also wrote a 3-part series of articles on functional programming
1391
+ for IBM's DeveloperWorks site; see
1392
+ `part 1 <http://www-128.ibm.com/developerworks/library/l-prog.html>`__,
1393
+ `part 2 <http://www-128.ibm.com/developerworks/library/l-prog2.html>`__, and
1394
+ `part 3 <http://www-128.ibm.com/developerworks/linux/library/l-prog3.html>`__.
1395
+
1396
+
1397
+ Python documentation
1398
+ '''''''''''''''''''''''''''
1399
+
1400
+ http://docs.python.org/lib/module-itertools.html:
1401
+ Documentation for the ``itertools`` module.
1402
+
1403
+ http://docs.python.org/lib/module-operator.html:
1404
+ Documentation for the ``operator`` module.
1405
+
1406
+ http://www.python.org/dev/peps/pep-0289/:
1407
+ PEP 289: "Generator Expressions"
1408
+
1409
+ http://www.python.org/dev/peps/pep-0342/:
1410
+ PEP 342: "Coroutines via Enhanced Generators" describes the new generator
1411
+ features in Python 2.5.
1412
+
1413
+ .. comment
1414
+
1415
+ Topics to place
1416
+ -----------------------------
1417
+
1418
+ XXX os.walk()
1419
+
1420
+ XXX Need a large example.
1421
+
1422
+ But will an example add much? I'll post a first draft and see
1423
+ what the comments say.
1424
+
1425
+ .. comment
1426
+
1427
+ Original outline:
1428
+ Introduction
1429
+ Idea of FP
1430
+ Programs built out of functions
1431
+ Functions are strictly input-output, no internal state
1432
+ Opposed to OO programming, where objects have state
1433
+
1434
+ Why FP?
1435
+ Formal provability
1436
+ Assignment is difficult to reason about
1437
+ Not very relevant to Python
1438
+ Modularity
1439
+ Small functions that do one thing
1440
+ Debuggability:
1441
+ Easy to test due to lack of state
1442
+ Easy to verify output from intermediate steps
1443
+ Composability
1444
+ You assemble a toolbox of functions that can be mixed
1445
+
1446
+ Tackling a problem
1447
+ Need a significant example
1448
+
1449
+ Iterators
1450
+ Generators
1451
+ The itertools module
1452
+ List comprehensions
1453
+ Small functions and the lambda statement
1454
+ Built-in functions
1455
+ map
1456
+ filter
1457
+ reduce
1458
+
1459
+ .. comment
1460
+
1461
+ Handy little function for printing part of an iterator -- used
1462
+ while writing this document.
1463
+
1464
+ import itertools, sys
1465
+ def print_iter(it):
1466
+     elems = list(itertools.islice(it, 10))
1467
+     for elem in elems[:-1]:
1468
+         sys.stdout.write(str(elem))
1469
+         sys.stdout.write(', ')
1470
+     print elems[-1]
1471
+
1472
+