pointblank 0.19.0__py3-none-any.whl → 0.20.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (318)
  1. pointblank/__init__.py +44 -1
  2. pointblank/_utils_llms_txt.py +20 -0
  3. pointblank/data/api-docs.txt +793 -1
  4. pointblank/field.py +1507 -0
  5. pointblank/generate/__init__.py +17 -0
  6. pointblank/generate/base.py +49 -0
  7. pointblank/generate/generators.py +573 -0
  8. pointblank/generate/regex.py +217 -0
  9. pointblank/locales/__init__.py +1476 -0
  10. pointblank/locales/data/AR/address.json +73 -0
  11. pointblank/locales/data/AR/company.json +60 -0
  12. pointblank/locales/data/AR/internet.json +19 -0
  13. pointblank/locales/data/AR/misc.json +7 -0
  14. pointblank/locales/data/AR/person.json +39 -0
  15. pointblank/locales/data/AR/text.json +38 -0
  16. pointblank/locales/data/AT/address.json +84 -0
  17. pointblank/locales/data/AT/company.json +65 -0
  18. pointblank/locales/data/AT/internet.json +20 -0
  19. pointblank/locales/data/AT/misc.json +8 -0
  20. pointblank/locales/data/AT/person.json +17 -0
  21. pointblank/locales/data/AT/text.json +35 -0
  22. pointblank/locales/data/AU/address.json +83 -0
  23. pointblank/locales/data/AU/company.json +65 -0
  24. pointblank/locales/data/AU/internet.json +20 -0
  25. pointblank/locales/data/AU/misc.json +8 -0
  26. pointblank/locales/data/AU/person.json +17 -0
  27. pointblank/locales/data/AU/text.json +35 -0
  28. pointblank/locales/data/BE/address.json +225 -0
  29. pointblank/locales/data/BE/company.json +129 -0
  30. pointblank/locales/data/BE/internet.json +36 -0
  31. pointblank/locales/data/BE/misc.json +6 -0
  32. pointblank/locales/data/BE/person.json +62 -0
  33. pointblank/locales/data/BE/text.json +38 -0
  34. pointblank/locales/data/BG/address.json +75 -0
  35. pointblank/locales/data/BG/company.json +60 -0
  36. pointblank/locales/data/BG/internet.json +19 -0
  37. pointblank/locales/data/BG/misc.json +7 -0
  38. pointblank/locales/data/BG/person.json +40 -0
  39. pointblank/locales/data/BG/text.json +38 -0
  40. pointblank/locales/data/BR/address.json +98 -0
  41. pointblank/locales/data/BR/company.json +65 -0
  42. pointblank/locales/data/BR/internet.json +20 -0
  43. pointblank/locales/data/BR/misc.json +8 -0
  44. pointblank/locales/data/BR/person.json +17 -0
  45. pointblank/locales/data/BR/text.json +35 -0
  46. pointblank/locales/data/CA/address.json +747 -0
  47. pointblank/locales/data/CA/company.json +120 -0
  48. pointblank/locales/data/CA/internet.json +24 -0
  49. pointblank/locales/data/CA/misc.json +11 -0
  50. pointblank/locales/data/CA/person.json +1033 -0
  51. pointblank/locales/data/CA/text.json +58 -0
  52. pointblank/locales/data/CH/address.json +184 -0
  53. pointblank/locales/data/CH/company.json +112 -0
  54. pointblank/locales/data/CH/internet.json +20 -0
  55. pointblank/locales/data/CH/misc.json +10 -0
  56. pointblank/locales/data/CH/person.json +64 -0
  57. pointblank/locales/data/CH/text.json +45 -0
  58. pointblank/locales/data/CL/address.json +71 -0
  59. pointblank/locales/data/CL/company.json +60 -0
  60. pointblank/locales/data/CL/internet.json +19 -0
  61. pointblank/locales/data/CL/misc.json +7 -0
  62. pointblank/locales/data/CL/person.json +38 -0
  63. pointblank/locales/data/CL/text.json +38 -0
  64. pointblank/locales/data/CN/address.json +124 -0
  65. pointblank/locales/data/CN/company.json +76 -0
  66. pointblank/locales/data/CN/internet.json +20 -0
  67. pointblank/locales/data/CN/misc.json +8 -0
  68. pointblank/locales/data/CN/person.json +50 -0
  69. pointblank/locales/data/CN/text.json +38 -0
  70. pointblank/locales/data/CO/address.json +76 -0
  71. pointblank/locales/data/CO/company.json +60 -0
  72. pointblank/locales/data/CO/internet.json +19 -0
  73. pointblank/locales/data/CO/misc.json +7 -0
  74. pointblank/locales/data/CO/person.json +38 -0
  75. pointblank/locales/data/CO/text.json +38 -0
  76. pointblank/locales/data/CY/address.json +62 -0
  77. pointblank/locales/data/CY/company.json +60 -0
  78. pointblank/locales/data/CY/internet.json +19 -0
  79. pointblank/locales/data/CY/misc.json +7 -0
  80. pointblank/locales/data/CY/person.json +38 -0
  81. pointblank/locales/data/CY/text.json +38 -0
  82. pointblank/locales/data/CZ/address.json +70 -0
  83. pointblank/locales/data/CZ/company.json +61 -0
  84. pointblank/locales/data/CZ/internet.json +19 -0
  85. pointblank/locales/data/CZ/misc.json +7 -0
  86. pointblank/locales/data/CZ/person.json +40 -0
  87. pointblank/locales/data/CZ/text.json +38 -0
  88. pointblank/locales/data/DE/address.json +756 -0
  89. pointblank/locales/data/DE/company.json +101 -0
  90. pointblank/locales/data/DE/internet.json +22 -0
  91. pointblank/locales/data/DE/misc.json +11 -0
  92. pointblank/locales/data/DE/person.json +1026 -0
  93. pointblank/locales/data/DE/text.json +50 -0
  94. pointblank/locales/data/DK/address.json +231 -0
  95. pointblank/locales/data/DK/company.json +65 -0
  96. pointblank/locales/data/DK/internet.json +20 -0
  97. pointblank/locales/data/DK/misc.json +7 -0
  98. pointblank/locales/data/DK/person.json +45 -0
  99. pointblank/locales/data/DK/text.json +43 -0
  100. pointblank/locales/data/EE/address.json +69 -0
  101. pointblank/locales/data/EE/company.json +60 -0
  102. pointblank/locales/data/EE/internet.json +19 -0
  103. pointblank/locales/data/EE/misc.json +7 -0
  104. pointblank/locales/data/EE/person.json +39 -0
  105. pointblank/locales/data/EE/text.json +38 -0
  106. pointblank/locales/data/ES/address.json +3086 -0
  107. pointblank/locales/data/ES/company.json +644 -0
  108. pointblank/locales/data/ES/internet.json +25 -0
  109. pointblank/locales/data/ES/misc.json +11 -0
  110. pointblank/locales/data/ES/person.json +488 -0
  111. pointblank/locales/data/ES/text.json +49 -0
  112. pointblank/locales/data/FI/address.json +93 -0
  113. pointblank/locales/data/FI/company.json +65 -0
  114. pointblank/locales/data/FI/internet.json +20 -0
  115. pointblank/locales/data/FI/misc.json +8 -0
  116. pointblank/locales/data/FI/person.json +17 -0
  117. pointblank/locales/data/FI/text.json +35 -0
  118. pointblank/locales/data/FR/address.json +619 -0
  119. pointblank/locales/data/FR/company.json +111 -0
  120. pointblank/locales/data/FR/internet.json +22 -0
  121. pointblank/locales/data/FR/misc.json +11 -0
  122. pointblank/locales/data/FR/person.json +1066 -0
  123. pointblank/locales/data/FR/text.json +50 -0
  124. pointblank/locales/data/GB/address.json +5759 -0
  125. pointblank/locales/data/GB/company.json +131 -0
  126. pointblank/locales/data/GB/internet.json +24 -0
  127. pointblank/locales/data/GB/misc.json +45 -0
  128. pointblank/locales/data/GB/person.json +578 -0
  129. pointblank/locales/data/GB/text.json +61 -0
  130. pointblank/locales/data/GR/address.json +68 -0
  131. pointblank/locales/data/GR/company.json +61 -0
  132. pointblank/locales/data/GR/internet.json +19 -0
  133. pointblank/locales/data/GR/misc.json +7 -0
  134. pointblank/locales/data/GR/person.json +39 -0
  135. pointblank/locales/data/GR/text.json +38 -0
  136. pointblank/locales/data/HK/address.json +79 -0
  137. pointblank/locales/data/HK/company.json +69 -0
  138. pointblank/locales/data/HK/internet.json +19 -0
  139. pointblank/locales/data/HK/misc.json +7 -0
  140. pointblank/locales/data/HK/person.json +42 -0
  141. pointblank/locales/data/HK/text.json +38 -0
  142. pointblank/locales/data/HR/address.json +73 -0
  143. pointblank/locales/data/HR/company.json +60 -0
  144. pointblank/locales/data/HR/internet.json +19 -0
  145. pointblank/locales/data/HR/misc.json +7 -0
  146. pointblank/locales/data/HR/person.json +38 -0
  147. pointblank/locales/data/HR/text.json +38 -0
  148. pointblank/locales/data/HU/address.json +70 -0
  149. pointblank/locales/data/HU/company.json +61 -0
  150. pointblank/locales/data/HU/internet.json +19 -0
  151. pointblank/locales/data/HU/misc.json +7 -0
  152. pointblank/locales/data/HU/person.json +40 -0
  153. pointblank/locales/data/HU/text.json +38 -0
  154. pointblank/locales/data/ID/address.json +68 -0
  155. pointblank/locales/data/ID/company.json +61 -0
  156. pointblank/locales/data/ID/internet.json +19 -0
  157. pointblank/locales/data/ID/misc.json +7 -0
  158. pointblank/locales/data/ID/person.json +40 -0
  159. pointblank/locales/data/ID/text.json +38 -0
  160. pointblank/locales/data/IE/address.json +643 -0
  161. pointblank/locales/data/IE/company.json +140 -0
  162. pointblank/locales/data/IE/internet.json +24 -0
  163. pointblank/locales/data/IE/misc.json +44 -0
  164. pointblank/locales/data/IE/person.json +55 -0
  165. pointblank/locales/data/IE/text.json +60 -0
  166. pointblank/locales/data/IN/address.json +92 -0
  167. pointblank/locales/data/IN/company.json +65 -0
  168. pointblank/locales/data/IN/internet.json +20 -0
  169. pointblank/locales/data/IN/misc.json +8 -0
  170. pointblank/locales/data/IN/person.json +52 -0
  171. pointblank/locales/data/IN/text.json +39 -0
  172. pointblank/locales/data/IS/address.json +63 -0
  173. pointblank/locales/data/IS/company.json +61 -0
  174. pointblank/locales/data/IS/internet.json +19 -0
  175. pointblank/locales/data/IS/misc.json +7 -0
  176. pointblank/locales/data/IS/person.json +44 -0
  177. pointblank/locales/data/IS/text.json +38 -0
  178. pointblank/locales/data/IT/address.json +192 -0
  179. pointblank/locales/data/IT/company.json +137 -0
  180. pointblank/locales/data/IT/internet.json +20 -0
  181. pointblank/locales/data/IT/misc.json +10 -0
  182. pointblank/locales/data/IT/person.json +70 -0
  183. pointblank/locales/data/IT/text.json +44 -0
  184. pointblank/locales/data/JP/address.json +713 -0
  185. pointblank/locales/data/JP/company.json +113 -0
  186. pointblank/locales/data/JP/internet.json +22 -0
  187. pointblank/locales/data/JP/misc.json +10 -0
  188. pointblank/locales/data/JP/person.json +1057 -0
  189. pointblank/locales/data/JP/text.json +51 -0
  190. pointblank/locales/data/KR/address.json +77 -0
  191. pointblank/locales/data/KR/company.json +68 -0
  192. pointblank/locales/data/KR/internet.json +19 -0
  193. pointblank/locales/data/KR/misc.json +7 -0
  194. pointblank/locales/data/KR/person.json +40 -0
  195. pointblank/locales/data/KR/text.json +38 -0
  196. pointblank/locales/data/LT/address.json +66 -0
  197. pointblank/locales/data/LT/company.json +60 -0
  198. pointblank/locales/data/LT/internet.json +19 -0
  199. pointblank/locales/data/LT/misc.json +7 -0
  200. pointblank/locales/data/LT/person.json +42 -0
  201. pointblank/locales/data/LT/text.json +38 -0
  202. pointblank/locales/data/LU/address.json +66 -0
  203. pointblank/locales/data/LU/company.json +60 -0
  204. pointblank/locales/data/LU/internet.json +19 -0
  205. pointblank/locales/data/LU/misc.json +7 -0
  206. pointblank/locales/data/LU/person.json +38 -0
  207. pointblank/locales/data/LU/text.json +38 -0
  208. pointblank/locales/data/LV/address.json +62 -0
  209. pointblank/locales/data/LV/company.json +60 -0
  210. pointblank/locales/data/LV/internet.json +19 -0
  211. pointblank/locales/data/LV/misc.json +7 -0
  212. pointblank/locales/data/LV/person.json +40 -0
  213. pointblank/locales/data/LV/text.json +38 -0
  214. pointblank/locales/data/MT/address.json +61 -0
  215. pointblank/locales/data/MT/company.json +60 -0
  216. pointblank/locales/data/MT/internet.json +19 -0
  217. pointblank/locales/data/MT/misc.json +7 -0
  218. pointblank/locales/data/MT/person.json +38 -0
  219. pointblank/locales/data/MT/text.json +38 -0
  220. pointblank/locales/data/MX/address.json +100 -0
  221. pointblank/locales/data/MX/company.json +65 -0
  222. pointblank/locales/data/MX/internet.json +20 -0
  223. pointblank/locales/data/MX/misc.json +8 -0
  224. pointblank/locales/data/MX/person.json +18 -0
  225. pointblank/locales/data/MX/text.json +39 -0
  226. pointblank/locales/data/NL/address.json +1517 -0
  227. pointblank/locales/data/NL/company.json +133 -0
  228. pointblank/locales/data/NL/internet.json +44 -0
  229. pointblank/locales/data/NL/misc.json +55 -0
  230. pointblank/locales/data/NL/person.json +365 -0
  231. pointblank/locales/data/NL/text.json +210 -0
  232. pointblank/locales/data/NO/address.json +86 -0
  233. pointblank/locales/data/NO/company.json +66 -0
  234. pointblank/locales/data/NO/internet.json +20 -0
  235. pointblank/locales/data/NO/misc.json +8 -0
  236. pointblank/locales/data/NO/person.json +17 -0
  237. pointblank/locales/data/NO/text.json +35 -0
  238. pointblank/locales/data/NZ/address.json +90 -0
  239. pointblank/locales/data/NZ/company.json +65 -0
  240. pointblank/locales/data/NZ/internet.json +20 -0
  241. pointblank/locales/data/NZ/misc.json +8 -0
  242. pointblank/locales/data/NZ/person.json +17 -0
  243. pointblank/locales/data/NZ/text.json +39 -0
  244. pointblank/locales/data/PH/address.json +67 -0
  245. pointblank/locales/data/PH/company.json +61 -0
  246. pointblank/locales/data/PH/internet.json +19 -0
  247. pointblank/locales/data/PH/misc.json +7 -0
  248. pointblank/locales/data/PH/person.json +40 -0
  249. pointblank/locales/data/PH/text.json +38 -0
  250. pointblank/locales/data/PL/address.json +91 -0
  251. pointblank/locales/data/PL/company.json +65 -0
  252. pointblank/locales/data/PL/internet.json +20 -0
  253. pointblank/locales/data/PL/misc.json +8 -0
  254. pointblank/locales/data/PL/person.json +17 -0
  255. pointblank/locales/data/PL/text.json +35 -0
  256. pointblank/locales/data/PT/address.json +90 -0
  257. pointblank/locales/data/PT/company.json +65 -0
  258. pointblank/locales/data/PT/internet.json +20 -0
  259. pointblank/locales/data/PT/misc.json +8 -0
  260. pointblank/locales/data/PT/person.json +17 -0
  261. pointblank/locales/data/PT/text.json +35 -0
  262. pointblank/locales/data/RO/address.json +73 -0
  263. pointblank/locales/data/RO/company.json +61 -0
  264. pointblank/locales/data/RO/internet.json +19 -0
  265. pointblank/locales/data/RO/misc.json +7 -0
  266. pointblank/locales/data/RO/person.json +40 -0
  267. pointblank/locales/data/RO/text.json +38 -0
  268. pointblank/locales/data/RU/address.json +74 -0
  269. pointblank/locales/data/RU/company.json +60 -0
  270. pointblank/locales/data/RU/internet.json +19 -0
  271. pointblank/locales/data/RU/misc.json +7 -0
  272. pointblank/locales/data/RU/person.json +38 -0
  273. pointblank/locales/data/RU/text.json +38 -0
  274. pointblank/locales/data/SE/address.json +247 -0
  275. pointblank/locales/data/SE/company.json +65 -0
  276. pointblank/locales/data/SE/internet.json +20 -0
  277. pointblank/locales/data/SE/misc.json +7 -0
  278. pointblank/locales/data/SE/person.json +45 -0
  279. pointblank/locales/data/SE/text.json +43 -0
  280. pointblank/locales/data/SI/address.json +67 -0
  281. pointblank/locales/data/SI/company.json +60 -0
  282. pointblank/locales/data/SI/internet.json +19 -0
  283. pointblank/locales/data/SI/misc.json +7 -0
  284. pointblank/locales/data/SI/person.json +38 -0
  285. pointblank/locales/data/SI/text.json +38 -0
  286. pointblank/locales/data/SK/address.json +64 -0
  287. pointblank/locales/data/SK/company.json +60 -0
  288. pointblank/locales/data/SK/internet.json +19 -0
  289. pointblank/locales/data/SK/misc.json +7 -0
  290. pointblank/locales/data/SK/person.json +38 -0
  291. pointblank/locales/data/SK/text.json +38 -0
  292. pointblank/locales/data/TR/address.json +105 -0
  293. pointblank/locales/data/TR/company.json +65 -0
  294. pointblank/locales/data/TR/internet.json +20 -0
  295. pointblank/locales/data/TR/misc.json +8 -0
  296. pointblank/locales/data/TR/person.json +17 -0
  297. pointblank/locales/data/TR/text.json +35 -0
  298. pointblank/locales/data/TW/address.json +86 -0
  299. pointblank/locales/data/TW/company.json +69 -0
  300. pointblank/locales/data/TW/internet.json +19 -0
  301. pointblank/locales/data/TW/misc.json +7 -0
  302. pointblank/locales/data/TW/person.json +42 -0
  303. pointblank/locales/data/TW/text.json +38 -0
  304. pointblank/locales/data/US/address.json +996 -0
  305. pointblank/locales/data/US/company.json +131 -0
  306. pointblank/locales/data/US/internet.json +22 -0
  307. pointblank/locales/data/US/misc.json +11 -0
  308. pointblank/locales/data/US/person.json +1092 -0
  309. pointblank/locales/data/US/text.json +56 -0
  310. pointblank/locales/data/_shared/misc.json +42 -0
  311. pointblank/schema.py +339 -2
  312. {pointblank-0.19.0.dist-info → pointblank-0.20.0.dist-info}/METADATA +45 -1
  313. pointblank-0.20.0.dist-info/RECORD +366 -0
  314. {pointblank-0.19.0.dist-info → pointblank-0.20.0.dist-info}/WHEEL +1 -1
  315. pointblank-0.19.0.dist-info/RECORD +0 -59
  316. {pointblank-0.19.0.dist-info → pointblank-0.20.0.dist-info}/entry_points.txt +0 -0
  317. {pointblank-0.19.0.dist-info → pointblank-0.20.0.dist-info}/licenses/LICENSE +0 -0
  318. {pointblank-0.19.0.dist-info → pointblank-0.20.0.dist-info}/top_level.txt +0 -0
pointblank/locales/data/US/text.json ADDED
@@ -0,0 +1,56 @@
+{
+  "words": [
+    "lorem", "ipsum", "dolor", "sit", "amet", "consectetur", "adipiscing", "elit",
+    "sed", "do", "eiusmod", "tempor", "incididunt", "ut", "labore", "et", "dolore",
+    "magna", "aliqua", "enim", "ad", "minim", "veniam", "quis", "nostrud",
+    "exercitation", "ullamco", "laboris", "nisi", "aliquip", "ex", "ea", "commodo",
+    "consequat", "duis", "aute", "irure", "in", "reprehenderit", "voluptate",
+    "velit", "esse", "cillum", "fugiat", "nulla", "pariatur", "excepteur", "sint",
+    "occaecat", "cupidatat", "non", "proident", "sunt", "culpa", "qui", "officia",
+    "deserunt", "mollit", "anim", "id", "est", "laborum", "the", "and", "for",
+    "are", "but", "not", "you", "all", "can", "had", "her", "was", "one", "our",
+    "out", "day", "get", "has", "him", "his", "how", "its", "may", "new", "now",
+    "old", "see", "two", "way", "who", "boy", "did", "own", "say", "she", "too",
+    "use", "time", "very", "when", "come", "could", "make", "like", "back", "only",
+    "over", "such", "year", "into", "just", "most", "also", "been", "call", "from",
+    "have", "more", "made", "find", "long", "down", "look", "many", "then", "them",
+    "well", "would", "about", "after", "being", "could", "first", "great", "little",
+    "might", "never", "other", "right", "still", "their", "there", "these", "thing",
+    "think", "those", "three", "today", "under", "water", "where", "which", "while",
+    "world", "write", "years"
+  ],
+  "sentence_patterns": [
+    "The {adjective} {noun} {verb} the {adjective} {noun}.",
+    "A {noun} {verb} {adverb} in the {noun}.",
+    "{proper_noun} {verb} the {adjective} {noun} {adverb}."
+  ],
+  "adjectives": [
+    "quick", "brown", "lazy", "fast", "slow", "big", "small", "old", "new", "young",
+    "bright", "dark", "loud", "quiet", "hot", "cold", "warm", "cool", "soft", "hard",
+    "smooth", "rough", "clean", "dirty", "happy", "sad", "angry", "calm", "brave",
+    "shy", "clever", "foolish", "kind", "mean", "rich", "poor", "strong", "weak",
+    "tall", "short", "wide", "narrow", "deep", "shallow", "thick", "thin", "heavy",
+    "light", "full", "empty", "open", "closed", "wet", "dry", "fresh", "stale"
+  ],
+  "nouns": [
+    "dog", "cat", "bird", "fish", "tree", "house", "car", "book", "table", "chair",
+    "door", "window", "road", "river", "mountain", "ocean", "sky", "sun", "moon",
+    "star", "cloud", "rain", "snow", "wind", "fire", "water", "earth", "air",
+    "stone", "wood", "metal", "glass", "paper", "cloth", "food", "drink", "money",
+    "time", "place", "person", "child", "man", "woman", "boy", "girl", "friend",
+    "family", "school", "work", "city", "country", "world", "life", "day", "night",
+    "morning", "evening", "week", "month", "year", "story", "song", "game", "sport"
+  ],
+  "verbs": [
+    "runs", "jumps", "walks", "talks", "sees", "hears", "feels", "thinks", "knows",
+    "wants", "needs", "likes", "loves", "hates", "finds", "gives", "takes", "makes",
+    "says", "tells", "asks", "answers", "helps", "works", "plays", "reads", "writes",
+    "draws", "sings", "dances", "cooks", "eats", "drinks", "sleeps", "wakes", "opens",
+    "closes", "starts", "stops", "begins", "ends", "moves", "stays", "comes", "goes"
+  ],
+  "adverbs": [
+    "quickly", "slowly", "carefully", "happily", "sadly", "loudly", "quietly",
+    "easily", "hardly", "always", "never", "often", "sometimes", "usually",
+    "really", "very", "quite", "almost", "already", "still", "just", "only"
+  ]
+}
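The `sentence_patterns` entries use `{placeholder}` fields whose names line up with the word lists in the same file. The sketch below shows one way such a template could be filled, with each placeholder drawn independently so that a repeated `{noun}` can yield two different nouns. It is an illustration under assumptions, not pointblank's generator code, which this hunk does not show. The `fill_pattern` helper, the shortened word lists, and the made-up list for `{proper_noun}` (which has no matching list in this file) are all hypothetical.

```python
import random
import re

# Shortened copies of the lists in the hunk above; the real file has many more entries.
WORDS = {
    "adjective": ["quick", "brown", "lazy"],
    "noun": ["dog", "cat", "river"],
    "verb": ["runs", "sees", "finds"],
    "adverb": ["quickly", "slowly", "always"],
    # Assumption: the file defines no proper-noun list, so these values are invented.
    "proper_noun": ["Alex", "Sam"],
}

PATTERNS = [
    "The {adjective} {noun} {verb} the {adjective} {noun}.",
    "A {noun} {verb} {adverb} in the {noun}.",
    "{proper_noun} {verb} the {adjective} {noun} {adverb}.",
]

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def fill_pattern(pattern: str, rng: random.Random) -> str:
    """Replace every {placeholder} with an independently chosen word (hypothetical helper)."""
    return PLACEHOLDER.sub(lambda m: rng.choice(WORDS[m.group(1)]), pattern)


rng = random.Random(23)  # a fixed seed makes the result repeatable
sentence = fill_pattern(rng.choice(PATTERNS), rng)
print(sentence)
```

Using `re.sub` with a callback rather than `str.format` is a deliberate choice here: `str.format` would substitute the same value for both occurrences of a repeated field name.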
pointblank/locales/data/_shared/misc.json ADDED
@@ -0,0 +1,42 @@
+{
+  "file_extensions": [
+    "txt", "pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx", "csv", "json",
+    "xml", "html", "css", "js", "py", "java", "cpp", "h", "md", "rst",
+    "png", "jpg", "jpeg", "gif", "svg", "bmp", "ico", "webp", "tiff",
+    "mp3", "wav", "ogg", "flac", "aac", "wma",
+    "mp4", "avi", "mkv", "mov", "wmv", "flv", "webm",
+    "zip", "tar", "gz", "rar", "7z", "bz2",
+    "sql", "db", "sqlite", "tsv", "yaml", "yml",
+    "c", "cs", "rb", "php", "go", "rs", "ts", "tsx", "jsx"
+  ],
+  "mime_types": [
+    "text/plain", "text/html", "text/css", "text/javascript", "text/csv",
+    "application/json", "application/xml", "application/pdf",
+    "application/zip", "application/gzip", "application/x-tar",
+    "application/msword", "application/vnd.ms-excel", "application/vnd.ms-powerpoint",
+    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
+    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
+    "application/vnd.openxmlformats-officedocument.presentationml.presentation",
+    "image/png", "image/jpeg", "image/gif", "image/svg+xml", "image/webp", "image/bmp",
+    "audio/mpeg", "audio/wav", "audio/ogg", "audio/flac", "audio/aac",
+    "video/mp4", "video/webm", "video/x-msvideo", "video/quicktime", "video/x-matroska",
+    "application/octet-stream", "multipart/form-data"
+  ],
+  "currency_codes": [
+    "AED", "AFN", "ALL", "AMD", "ANG", "AOA", "ARS", "AUD", "AWG", "AZN",
+    "BAM", "BBD", "BDT", "BGN", "BHD", "BIF", "BMD", "BND", "BOB", "BRL",
+    "BSD", "BTN", "BWP", "BYN", "BZD", "CAD", "CDF", "CHF", "CLP", "CNY",
+    "COP", "CRC", "CUP", "CVE", "CZK", "DJF", "DKK", "DOP", "DZD", "EGP",
+    "ERN", "ETB", "EUR", "FJD", "FKP", "GBP", "GEL", "GHS", "GIP", "GMD",
+    "GNF", "GTQ", "GYD", "HKD", "HNL", "HTG", "HUF", "IDR", "ILS", "INR",
+    "IQD", "IRR", "ISK", "JMD", "JOD", "JPY", "KES", "KGS", "KHR", "KMF",
+    "KPW", "KRW", "KWD", "KYD", "KZT", "LAK", "LBP", "LKR", "LRD", "LSL",
+    "LYD", "MAD", "MDL", "MGA", "MKD", "MMK", "MNT", "MOP", "MRU", "MUR",
+    "MVR", "MWK", "MXN", "MYR", "MZN", "NAD", "NGN", "NIO", "NOK", "NPR",
+    "NZD", "OMR", "PAB", "PEN", "PGK", "PHP", "PKR", "PLN", "PYG", "QAR",
+    "RON", "RSD", "RUB", "RWF", "SAR", "SBD", "SCR", "SDG", "SEK", "SGD",
+    "SHP", "SLE", "SOS", "SRD", "SSP", "STN", "SVC", "SYP", "SZL", "THB",
+    "TJS", "TMT", "TND", "TOP", "TRY", "TTD", "TWD", "TZS", "UAH", "UGX",
+    "USD", "UYU", "UZS", "VES", "VND", "VUV", "WST", "YER", "ZAR", "ZMW", "ZWL"
+  ]
+}
pointblank/schema.py CHANGED
@@ -2,7 +2,7 @@ from __future__ import annotations
 
 import copy
 from dataclasses import dataclass
-from typing import TYPE_CHECKING
+from typing import TYPE_CHECKING, Literal
 
 import narwhals as nw
 
@@ -12,7 +12,9 @@ from pointblank._utils import _get_tbl_type, _is_lazy_frame, _is_lib_present, _i
 if TYPE_CHECKING:
     from typing import Any
 
-__all__ = ["Schema", "_check_schema_match"]
+    from pointblank.field import Field
+
+__all__ = ["Schema", "generate_dataset", "_check_schema_match"]
 
 
 @dataclass
@@ -789,6 +791,230 @@ class Schema:
789
791
  def __repr__(self):
790
792
  return f"Schema(columns={self.columns})"
791
793
 
794
+ def generate(
795
+ self,
796
+ n: int = 100,
797
+ seed: int | None = None,
798
+ output: Literal["polars", "pandas", "dict"] = "polars",
799
+ country: str = "US",
800
+ ) -> Any:
801
+ """
802
+ Generate synthetic test data conforming to this schema.
803
+
804
+ This method generates random data that matches the schema's column definitions. When the
805
+ schema is defined using `Field` objects with constraints (e.g., `min_val`, `max_val`,
806
+ `pattern`, `preset`), the generated data will respect those constraints.
807
+
808
+ Parameters
809
+ ----------
810
+ n
811
+ Number of rows to generate. Default is `100`.
812
+ seed
813
+ Random seed for reproducibility. If provided, the same seed will produce
814
+ the same data. Default is `None` (non-deterministic).
815
+ output
816
+ Output format for the generated data. Options are: (1) `"polars"` (default) returns a
817
+ Polars DataFrame, (2) `"pandas"` returns a Pandas DataFrame, and (3) `"dict"` returns
818
+ a dictionary of lists.
819
+ country
820
+ Country code for realistic data generation when using presets (e.g., `preset="email"`,
821
+ `preset="address"`). Accepts ISO 3166-1 alpha-2 codes (e.g., `"US"`, `"DE"`, `"FR"`)
822
+ or alpha-3 codes (e.g., `"USA"`, `"DEU"`, `"FRA"`). Default is `"US"`.
823
+
824
+ Returns
825
+ -------
826
+ DataFrame or dict
827
+ Generated data in the requested format.
828
+
829
+ Raises
830
+ ------
831
+ ValueError
832
+ If the schema has no columns or if constraints cannot be satisfied.
833
+ ImportError
834
+ If required optional dependencies are not installed.
835
+
836
+ Supported Countries
837
+ -------------------
838
+ The `country=` parameter controls the country used for generating realistic data with
839
+ presets (e.g., `preset="email"`, `preset="address"`). This affects location-specific
840
+ formats like addresses, phone numbers, and postal codes. Currently, **50 countries** are
841
+ supported with full locale data:
842
+
843
+ **Europe (32 countries):** Austria (`"AT"`), Belgium (`"BE"`), Bulgaria (`"BG"`),
844
+ Croatia (`"HR"`), Cyprus (`"CY"`), Czech Republic (`"CZ"`), Denmark (`"DK"`),
845
+ Estonia (`"EE"`), Finland (`"FI"`), France (`"FR"`), Germany (`"DE"`), Greece (`"GR"`),
846
+ Hungary (`"HU"`), Iceland (`"IS"`), Ireland (`"IE"`), Italy (`"IT"`), Latvia (`"LV"`),
847
+ Lithuania (`"LT"`), Luxembourg (`"LU"`), Malta (`"MT"`), Netherlands (`"NL"`),
848
+ Norway (`"NO"`), Poland (`"PL"`), Portugal (`"PT"`), Romania (`"RO"`), Russia (`"RU"`),
849
+ Slovakia (`"SK"`), Slovenia (`"SI"`), Spain (`"ES"`), Sweden (`"SE"`),
850
+ Switzerland (`"CH"`), United Kingdom (`"GB"`)
851
+
852
+ **Americas (7 countries):** Argentina (`"AR"`), Brazil (`"BR"`), Canada (`"CA"`),
853
+ Chile (`"CL"`), Colombia (`"CO"`), Mexico (`"MX"`), United States (`"US"`)
854
+
855
+ **Asia-Pacific (10 countries):** Australia (`"AU"`), China (`"CN"`), Hong Kong (`"HK"`),
856
+ India (`"IN"`), Indonesia (`"ID"`), Japan (`"JP"`), New Zealand (`"NZ"`),
857
+ Philippines (`"PH"`), South Korea (`"KR"`), Taiwan (`"TW"`)
858
+
859
+ **Middle East (1 country):** Turkey (`"TR"`)
860
+
861
+ Examples
862
+ --------
863
+ Using `pb.Schema` we first put together a schema with field constraints:
864
+
865
+ ```{python}
866
+ import pointblank as pb
867
+
868
+ schema = pb.Schema(
869
+ user_id=pb.int_field(min_val=1, unique=True),
870
+ email=pb.string_field(preset="email"),
871
+ age=pb.int_field(min_val=18, max_val=100),
872
+ status=pb.string_field(allowed=["active", "pending", "inactive"]),
873
+ )
874
+ ```
875
+
876
+ With the `generate()` method, we can obtain a set number of rows of generated data:
877
+
878
+ ```{python}
879
+ # Generate 100 rows of test data
880
+ pb.preview(schema.generate(n=100, seed=23))
881
+ ```
882
+
883
+ It's possible to generate data from a simple dtype-only schema:
884
+
885
+ ```{python}
886
+ schema = pb.Schema(name="String", age="Int64", active="Boolean")
887
+ pb.preview(schema.generate(n=50, seed=123, output="pandas"))
888
+ ```
889
+
890
+ We can obtain synthetic data with German addresses using presets for person name and city of
891
+ residence. Note the use of `country="DE"` in the `generate()` call:
892
+
893
+ ```{python}
894
+ schema = pb.Schema(
895
+ name=pb.string_field(preset="name"),
896
+ city=pb.string_field(preset="city"),
897
+ )
898
+ pb.preview(schema.generate(n=20, seed=23, country="DE"))
899
+ ```
900
+ """
901
+ from pointblank.field import Field
902
+ from pointblank.generate import GeneratorConfig, generate_dataframe
903
+
904
+ if self.columns is None or len(self.columns) == 0:
905
+ raise ValueError("Cannot generate data from an empty schema.")
906
+
907
+ # Convert schema columns to Field objects
908
+ fields: dict[str, Field] = {}
909
+
910
+ for col_tuple in self.columns:
911
+ col_name = col_tuple[0]
912
+
913
+ # Check if the value is already a Field object
914
+ if len(col_tuple) > 1 and isinstance(col_tuple[1], Field):
915
+ fields[col_name] = col_tuple[1]
916
+ elif len(col_tuple) > 1:
917
+ # Simple dtype string - convert to basic Field
918
+ dtype_str = col_tuple[1]
919
+ fields[col_name] = _dtype_string_to_field(dtype_str)
920
+ else:
921
+ # No dtype specified - default to String
922
+ fields[col_name] = Field(dtype="String")
923
+
924
+ # Create generator config
925
+ config = GeneratorConfig(
926
+ n=n,
927
+ seed=seed,
928
+ output=output,
929
+ country=country,
930
+ )
931
+
932
+ return generate_dataframe(fields, config)
933
+
934
+
935
+ def _dtype_string_to_field(dtype_str: str) -> "Field":
936
+ """
937
+ Convert a dtype string to a basic Field object.
938
+
939
+ This handles the mapping from various dtype string formats to
940
+ standardized Field dtypes.
941
+ """
942
+ from pointblank.field import (
943
+ BoolField,
944
+ DateField,
945
+ DatetimeField,
946
+ DurationField,
947
+ FloatField,
948
+ IntField,
949
+ StringField,
950
+ TimeField,
951
+ )
+
+    # Normalize dtype string
+    dtype_lower = dtype_str.lower()
+
+    # Map common dtype strings to Field classes and dtypes
+    dtype_mapping = {
+        # Integer types
+        "int8": ("int", "Int8"),
+        "int16": ("int", "Int16"),
+        "int32": ("int", "Int32"),
+        "int64": ("int", "Int64"),
+        "int": ("int", "Int64"),
+        "integer": ("int", "Int64"),
+        "uint8": ("int", "UInt8"),
+        "uint16": ("int", "UInt16"),
+        "uint32": ("int", "UInt32"),
+        "uint64": ("int", "UInt64"),
+        # Float types
+        "float32": ("float", "Float32"),
+        "float64": ("float", "Float64"),
+        "float": ("float", "Float64"),
+        "double": ("float", "Float64"),
+        # String types
+        "string": ("string", "String"),
+        "str": ("string", "String"),
+        "utf8": ("string", "String"),
+        "object": ("string", "String"),
+        # Boolean
+        "bool": ("bool", "Boolean"),
+        "boolean": ("bool", "Boolean"),
+        # Date/time types
+        "date": ("date", "Date"),
+        "datetime": ("datetime", "Datetime"),
+        "timestamp": ("datetime", "Datetime"),
+        "time": ("time", "Time"),
+        "duration": ("duration", "Duration"),
+        "timedelta": ("duration", "Duration"),
+    }
+
+    # Field class mapping
+    field_classes = {
+        "int": IntField,
+        "float": FloatField,
+        "string": StringField,
+        "bool": BoolField,
+        "date": DateField,
+        "datetime": DatetimeField,
+        "time": TimeField,
+        "duration": DurationField,
+    }
+
+    # Try direct mapping first
+    if dtype_lower in dtype_mapping:
+        field_type, dtype = dtype_mapping[dtype_lower]
+        field_class = field_classes[field_type]
+        return field_class(dtype=dtype)
+
+    # Try to match partial strings (e.g., "datetime64[ns]" -> "Datetime")
+    for key, (field_type, dtype) in dtype_mapping.items():
+        if key in dtype_lower:
+            field_class = field_classes[field_type]
+            return field_class(dtype=dtype)
+
+    # Default to StringField for unknown types
+    return StringField()
+
 
 def _process_columns(
     *,
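A note on the substring fallback in the hunk above: the loop walks `dtype_mapping` in insertion order and returns on the first key contained in the input, so a shorter key listed earlier can shadow a longer one. For instance, `"date"` precedes `"datetime"`, so `"datetime64[ns]"` appears to resolve to the `"date"` entry rather than to `"Datetime"` as the inline comment suggests. The standalone sketch below reproduces this with a trimmed copy of the mapping and shows one possible remedy, checking longer keys first. The helper names `first_match` and `longest_match` are hypothetical and are not part of pointblank.

```python
# Trimmed copy of the mapping from the diff, keeping the same relative key order.
DTYPE_MAPPING = {
    "int8": ("int", "Int8"),
    "int64": ("int", "Int64"),
    "int": ("int", "Int64"),
    "uint8": ("int", "UInt8"),
    "date": ("date", "Date"),
    "datetime": ("datetime", "Datetime"),
    "time": ("time", "Time"),
    "timedelta": ("duration", "Duration"),
}


def first_match(dtype_str):
    """Mimic the loop in the diff: first key (insertion order) found as a substring."""
    lowered = dtype_str.lower()
    for key, value in DTYPE_MAPPING.items():
        if key in lowered:
            return value
    return ("string", "String")


def longest_match(dtype_str):
    """Hypothetical alternative: try longer keys before shorter ones."""
    lowered = dtype_str.lower()
    for key in sorted(DTYPE_MAPPING, key=len, reverse=True):
        if key in lowered:
            return DTYPE_MAPPING[key]
    return ("string", "String")


# Compare both strategies on a few pandas/pyarrow-style dtype strings.
for name in ("datetime64[ns]", "timedelta64[ns]", "uint8[pyarrow]"):
    print(name, first_match(name), longest_match(name))
```

Sorting by key length is only one way to make the fallback order-independent; anchoring the match with `str.startswith` would be another, under the assumption that dtype strings always begin with the type name.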
@@ -1302,3 +1528,114 @@ def _check_schema_match(
     )
 
     return res
+
+
+def generate_dataset(
+    schema: Schema,
+    n: int = 100,
+    seed: int | None = None,
+    output: Literal["polars", "pandas", "dict"] = "polars",
+    country: str = "US",
+) -> Any:
+    """
+    Generate synthetic test data from a schema.
+
+    This function generates random data that conforms to a schema's column definitions. When the
+    schema is defined using `Field` objects with constraints (e.g., `min_val`, `max_val`,
+    `pattern`, `preset`), the generated data will respect those constraints.
+
+    This is a convenience function that wraps `Schema.generate()` for a more functional style
+    of usage, similar to how `load_dataset()` loads built-in datasets.
+
+    Parameters
+    ----------
+    schema
+        The schema object defining the structure and constraints of the data to generate.
+    n
+        Number of rows to generate. Default is `100`.
+    seed
+        Random seed for reproducibility. If provided, the same seed will produce
+        the same data. Default is `None` (non-deterministic).
+    output
+        Output format for the generated data. Options are: (1) `"polars"` (default) returns a
+        Polars DataFrame, (2) `"pandas"` returns a Pandas DataFrame, and (3) `"dict"` returns
+        a dictionary of lists.
+    country
+        Country code for realistic data generation when using presets (e.g., `preset="email"`,
+        `preset="address"`). Accepts ISO 3166-1 alpha-2 codes (e.g., `"US"`, `"DE"`, `"FR"`)
+        or alpha-3 codes (e.g., `"USA"`, `"DEU"`, `"FRA"`). Default is `"US"`.
+
+    Returns
+    -------
+    DataFrame or dict
+        Generated data in the requested format.
+
+    Raises
+    ------
+    ValueError
+        If the schema has no columns or if constraints cannot be satisfied.
+    ImportError
+        If required optional dependencies are not installed.
+
+    Supported Countries
+    -------------------
+    The `country=` parameter controls the country used for generating realistic data with
+    presets (e.g., `preset="email"`, `preset="address"`). This affects location-specific
+    formats like addresses, phone numbers, and postal codes. Currently, **50 countries** are
+    supported with full locale data:
+
+    **Europe (32 countries):** Austria (`"AT"`), Belgium (`"BE"`), Bulgaria (`"BG"`),
+    Croatia (`"HR"`), Cyprus (`"CY"`), Czech Republic (`"CZ"`), Denmark (`"DK"`),
+    Estonia (`"EE"`), Finland (`"FI"`), France (`"FR"`), Germany (`"DE"`), Greece (`"GR"`),
+    Hungary (`"HU"`), Iceland (`"IS"`), Ireland (`"IE"`), Italy (`"IT"`), Latvia (`"LV"`),
+    Lithuania (`"LT"`), Luxembourg (`"LU"`), Malta (`"MT"`), Netherlands (`"NL"`),
+    Norway (`"NO"`), Poland (`"PL"`), Portugal (`"PT"`), Romania (`"RO"`), Russia (`"RU"`),
+    Slovakia (`"SK"`), Slovenia (`"SI"`), Spain (`"ES"`), Sweden (`"SE"`),
+    Switzerland (`"CH"`), United Kingdom (`"GB"`)
+
+    **Americas (7 countries):** Argentina (`"AR"`), Brazil (`"BR"`), Canada (`"CA"`),
+    Chile (`"CL"`), Colombia (`"CO"`), Mexico (`"MX"`), United States (`"US"`)
+
+    **Asia-Pacific (10 countries):** Australia (`"AU"`), China (`"CN"`), Hong Kong (`"HK"`),
+    India (`"IN"`), Indonesia (`"ID"`), Japan (`"JP"`), New Zealand (`"NZ"`),
+    Philippines (`"PH"`), South Korea (`"KR"`), Taiwan (`"TW"`)
+
+    **Middle East (1 country):** Turkey (`"TR"`)
+
+    Examples
+    --------
+    Generate test data from a schema with field constraints:
+
+    ```{python}
+    import pointblank as pb
+
+    schema = pb.Schema(
+        user_id=pb.int_field(min_val=1, unique=True),
+        email=pb.string_field(preset="email"),
+        age=pb.int_field(min_val=18, max_val=100),
+        status=pb.string_field(allowed=["active", "pending", "inactive"]),
+    )
+
+    # Generate 100 rows of test data
+    pb.preview(pb.generate_dataset(schema, n=100, seed=23))
+    ```
+
+    Generate data from a simple dtype-only schema as a Pandas DataFrame:
+
+    ```{python}
+    schema = pb.Schema(name="String", age="Int64", active="Boolean")
+    pb.preview(pb.generate_dataset(schema, n=50, seed=23, output="pandas"))
+    ```
+
+    Generate data with German addresses by using `country="DE"`:
+
+    ```{python}
+    schema = pb.Schema(
+        name=pb.string_field(preset="name"),
+        address=pb.string_field(preset="address"),
+        city=pb.string_field(preset="city"),
+    )
+    pb.preview(pb.generate_dataset(schema, n=20, seed=23, country="DE"))
+    ```
+    """
+    return schema.generate(n=n, seed=seed, output=output, country=country)
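The docstring above states that `country=` accepts both alpha-2 and alpha-3 codes. How pointblank resolves them is not visible in this hunk. One plausible approach is a lookup table that folds alpha-3 input to alpha-2 before any locale data is loaded. The sketch below uses only the three code pairs the docstring names; `ALPHA3_TO_ALPHA2`, `SUPPORTED`, and `normalize_country` are hypothetical names, and the case-insensitive handling is an assumption rather than documented behavior.

```python
# Hypothetical lookup limited to the code pairs named in the docstring.
ALPHA3_TO_ALPHA2 = {"USA": "US", "DEU": "DE", "FRA": "FR"}
SUPPORTED = set(ALPHA3_TO_ALPHA2.values())


def normalize_country(code):
    """Return an upper-case alpha-2 code, or raise ValueError if it is unknown."""
    candidate = code.strip().upper()  # assumption: input is treated case-insensitively
    # Fold alpha-3 input to its alpha-2 equivalent when the table knows it.
    candidate = ALPHA3_TO_ALPHA2.get(candidate, candidate)
    if candidate not in SUPPORTED:
        raise ValueError(f"Unsupported country code: {code!r}")
    return candidate


print(normalize_country("deu"))
print(normalize_country("US"))
```

Normalizing once at the boundary means the rest of a generator only ever has to deal with a single code format.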
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: pointblank
-Version: 0.19.0
+Version: 0.20.0
 Summary: Find out if your data is what you think it is.
 Author-email: Richard Iannone <riannone@me.com>
 License: MIT License
@@ -451,6 +451,50 @@ Click the following headings to see some video demonstrations of the CLI:
 
 </details>
 
+## Generate Realistic Test Data
+
+Need test data for your validation workflows? The `generate_dataset()` function creates realistic, locale-aware synthetic data based on schema definitions. It's very useful for developing pipelines without production data, running CI/CD tests with reproducible scenarios, or prototyping workflows before production data is available.
+
+```python
+import pointblank as pb
+
+# Define a schema with field constraints
+schema = pb.Schema(
+    user_id=pb.int_field(min_val=1, unique=True),
+    name=pb.string_field(preset="name"),
+    email=pb.string_field(preset="email"),
+    age=pb.int_field(min_val=18, max_val=100),
+    status=pb.string_field(allowed=["active", "pending", "inactive"]),
+)
+
+# Generate 10 rows of realistic test data
+data = pb.generate_dataset(schema, n=10, seed=23)
+
+data
+```
+
+| user_id             | name             | email                      | age | status   |
+|---------------------|------------------|----------------------------|-----|----------|
+| 7188536481533917197 | Vivienne Rios    | vrios27@hotmail.com        | 55  | pending  |
+| 2674009078779859984 | William Schaefer | wschaefer28@yandex.com     | 28  | active   |
+| 7652102777077138151 | Lily Hansen      | lily779@aol.com            | 20  | active   |
+| 157503859921753049  | Shirley Mays     | shirley_mays@protonmail.com| 93  | inactive |
+| 2829213282471975080 | Sean Dawson      | sean_dawson@hotmail.com    | 57  | pending  |
+| 3497364383162086858 | Zachary Marsh    | zmarsh23@zoho.com          | 72  | pending  |
+| 3302703640991750415 | Gemma Gonzalez   | gemmagonzalez@yahoo.com    | 66  | pending  |
+| 6695746877064448147 | Brian Haley      | brian437@yandex.com        | 85  | inactive |
+| 2466163118311913924 | Nora Hernandez   | norahernandez@aol.com      | 63  | pending  |
+| 129827878195925732  | Diana Novak      | diana922@protonmail.com    | 34  | active   |
+
+The generator supports sophisticated data generation with these capabilities:
+
+- **Realistic data with presets**: Use built-in presets like `"name"`, `"email"`, `"address"`, `"phone"`, etc.
+- **50+ country support**: Generate locale-specific data (e.g., `country="DE"` for German addresses)
+- **Field constraints**: Control ranges, patterns, uniqueness, and allowed values
+- **Multiple output formats**: Returns Polars DataFrames by default, but also supports Pandas (`output="pandas"`) or dictionaries (`output="dict"`)
+
+This makes it easy to generate test data that matches your validation rules, helping you develop and test data quality workflows without relying on real data.
+
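To illustrate why a fixed `seed` yields reproducible output, and how constraints such as `min_val`, `max_val`, and `allowed` narrow what gets sampled, here is a toy generator built on the standard library's `random.Random`. It is not pointblank's implementation: `toy_generate` and its column-spec format are invented purely for illustration, and it returns a dictionary of lists, the shape the README describes for `output="dict"`.

```python
import random


def toy_generate(spec, n, seed=None):
    """Build a dict of lists from a tiny column spec (illustrative only)."""
    rng = random.Random(seed)  # private RNG: the same seed replays the same sequence
    data = {}
    for column, rule in spec.items():
        if "allowed" in rule:
            # Categorical column: sample from the allowed values.
            data[column] = [rng.choice(rule["allowed"]) for _ in range(n)]
        else:
            # Integer column: sample uniformly within the inclusive bounds.
            data[column] = [
                rng.randint(rule["min_val"], rule["max_val"]) for _ in range(n)
            ]
    return data


spec = {
    "age": {"min_val": 18, "max_val": 100},
    "status": {"allowed": ["active", "pending", "inactive"]},
}

first = toy_generate(spec, n=10, seed=23)
second = toy_generate(spec, n=10, seed=23)
print(first == second)  # both runs used seed=23
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the generator from disturbing, or being disturbed by, other code that touches the global random state.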
 ## Features That Set Pointblank Apart
 
 - **Complete validation workflow**: From data access to validation to reporting in a single pipeline