@tidyjs/tidy 2.6.0 → 2.6.1
- package/genai-docs/api-core.md +357 -0
- package/genai-docs/api-grouping.md +400 -0
- package/genai-docs/api-joins.md +118 -0
- package/genai-docs/api-other.md +238 -0
- package/genai-docs/api-pivot.md +112 -0
- package/genai-docs/api-selectors.md +159 -0
- package/genai-docs/api-sequences.md +127 -0
- package/genai-docs/api-slice.md +137 -0
- package/genai-docs/api-summarize.md +528 -0
- package/genai-docs/api-vector.md +239 -0
- package/genai-docs/gotchas.md +193 -0
- package/genai-docs/index.md +44 -0
- package/genai-docs/mental-model.md +270 -0
- package/genai-docs/patterns.md +384 -0
- package/genai-docs/quick-reference.md +125 -0
- package/package.json +3 -2
@@ -0,0 +1,125 @@
# Quick Reference

Map common data tasks to the right tidyjs function.

```js
import { tidy, filter, mutate, mutateWithSummary, arrange, asc, desc,
  select, distinct, rename, groupBy, summarize, count, tally,
  sum, mean, median, min, max, n, nDistinct, first, last,
  cumsum, lag, lead, roll, rowNumber,
  innerJoin, leftJoin, fullJoin,
  pivotWider, pivotLonger,
  sliceHead, sliceTail, sliceMin, sliceMax, sliceSample,
  complete, expand, fill, replaceNully, addRows, when, total,
  everything, startsWith, endsWith, contains, matches,
  rate, TMath } from '@tidyjs/tidy';
```

## Row Operations

| I want to... | Use |
|---|---|
| Filter rows by condition | `filter((d) => d.value > 10)` |
| Keep only unique rows | `distinct(['col1', 'col2'])` |
| Sort rows ascending | `arrange(asc('name'))` or `arrange((d) => d.name)` |
| Sort rows descending | `arrange(desc('value'))` |
| Sort by multiple columns | `arrange(asc('category'), desc('value'))` |
| Take first N rows | `sliceHead(5)` |
| Take last N rows | `sliceTail(5)` |
| Take rows with smallest values | `sliceMin(3, 'score')` |
| Take rows with largest values | `sliceMax(3, 'score')` |
| Take random sample | `sliceSample(10)` |
| Take a range of rows | `slice(2, 5)` |
| Add rows to the data | `addRows([{ name: 'new', value: 0 }])` |

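The "smallest values" and "largest values" rows can be read as "sort by a column, then keep the first N". The following is a minimal plain-JavaScript sketch of that idea, not the library's implementation: it does not call tidyjs, the helper names `smallestBy` and `largestBy` and the `scores` data are hypothetical, and the library's handling of ties or missing values may differ.

```js
// Hypothetical sample data for illustration only.
const scores = [
  { name: 'ana', score: 72 },
  { name: 'bo', score: 91 },
  { name: 'cy', score: 58 },
  { name: 'di', score: 85 },
];

// Sort a copy ascending by `key` and keep the first n rows.
function smallestBy(items, n, key) {
  return [...items].sort((a, b) => a[key] - b[key]).slice(0, n);
}

// Sort a copy descending by `key` and keep the first n rows.
function largestBy(items, n, key) {
  return [...items].sort((a, b) => b[key] - a[key]).slice(0, n);
}

const lowestTwo = smallestBy(scores, 2, 'score');
const highestTwo = largestBy(scores, 2, 'score');

console.log(lowestTwo.map((d) => d.name)); // [ 'cy', 'ana' ]
console.log(highestTwo.map((d) => d.name)); // [ 'bo', 'di' ]
```

Both helpers sort a copy, so the input array keeps its original order.
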
## Column Operations

| I want to... | Use |
|---|---|
| Add or modify a column (per item) | `mutate({ newCol: (d) => d.a + d.b })` |
| Add a column using cross-item calculation | `mutateWithSummary({ total: sum('value') })` |
| Keep only certain columns | `select(['name', 'value'])` |
| Drop specific columns | `select(['-password', '-secret'])` |
| Select columns by prefix | `select([startsWith('revenue_')])` |
| Select all columns except... | `select([everything(), '-internal'])` |
| Rename columns | `rename({ oldName: 'newName' })` |
| Keep only selected columns + transform | `transmute({ newCol: (d) => d.a * 2 })` |
| Replace null/undefined values | `replaceNully({ score: 0, name: 'unknown' })` |
| Set a constant value on all items | `mutate({ status: 'active' })` |

## Aggregation

| I want to... | Use |
|---|---|
| Sum a column | `summarize({ total: sum('value') })` |
| Average a column | `summarize({ avg: mean('score') })` |
| Find median | `summarize({ med: median('value') })` |
| Find min/max | `summarize({ lo: min('val'), hi: max('val') })` |
| Count rows | `summarize({ count: n() })` |
| Count distinct values | `summarize({ unique: nDistinct('category') })` |
| Get first/last value | `summarize({ start: first('date'), end: last('date') })` |
| Standard deviation | `summarize({ sd: deviation('value') })` |
| Variance | `summarize({ v: variance('value') })` |
| Count shorthand | `count('category')` |
| Tally rows | `tally()` |
| Append a total row (keep originals) | `total({ value: sum('value') })` |

## Grouping

| I want to... | Use |
|---|---|
| Group and aggregate | `groupBy('key', [summarize({ total: sum('val') })])` |
| Group by multiple keys | `groupBy(['cat', 'subcat'], [summarize(...)])` |
| Group by computed key | `groupBy((d) => d.date.getFullYear(), [summarize(...)])` |
| Get result as plain object | `groupBy('key', [ops], groupBy.object())` |
| Get result as entries array | `groupBy('key', [ops], groupBy.entries())` |
| Get result as Map | `groupBy('key', [ops], groupBy.map())` |
| Get single item per group (after summarize) | `groupBy('key', [summarize(...)], groupBy.object({ single: true }))` |

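The "Group and aggregate" row amounts to bucketing items by a key and reducing each bucket to one row. Below is a rough plain-JavaScript sketch of that idea under stated assumptions: it does not use tidyjs, the `groupTotals` helper and the `sales` data are hypothetical, and the exact output shape of the library's `groupBy` depends on the export option chosen.

```js
// Hypothetical sample data for illustration only.
const sales = [
  { key: 'a', val: 2 },
  { key: 'b', val: 5 },
  { key: 'a', val: 3 },
];

// Bucket items by `keyName` and sum `valueName` within each bucket.
function groupTotals(items, keyName, valueName) {
  const totals = new Map();
  for (const item of items) {
    const k = item[keyName];
    totals.set(k, (totals.get(k) ?? 0) + item[valueName]);
  }
  // One output row per group, in first-seen key order.
  return Array.from(totals, ([k, total]) => ({ [keyName]: k, total }));
}

const totalsByKey = groupTotals(sales, 'key', 'val');
console.log(totalsByKey); // [ { key: 'a', total: 5 }, { key: 'b', total: 5 } ]

// Keyed lookup with a single row per group, similar in spirit to an object export.
const totalsObject = Object.fromEntries(totalsByKey.map((row) => [row.key, row]));
console.log(totalsObject.a.total); // 5
```
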
## Cross-Item Column Operations

| I want to... | Use |
|---|---|
| Running total (cumulative sum) | `mutateWithSummary({ cum: cumsum('value') })` |
| Previous row's value | `mutateWithSummary({ prev: lag('value') })` |
| Next row's value | `mutateWithSummary({ next: lead('value') })` |
| Rolling average (window of N) | `mutateWithSummary({ avg: roll(3, mean('value')) })` |
| Row number / rank | `mutateWithSummary({ rank: rowNumber() })` |
| Percentage of total | `mutateWithSummary({ pct: (items) => items.map(d => d.value / sum('value')(items)) })` |

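Cross-item columns differ from per-item columns in that each value depends on the whole array, not just the current row. The sketch below shows, in plain JavaScript and without tidyjs, what a running total, a previous-row value, and a percentage of total look like as computations. The sample `items` data is hypothetical, and the `undefined` used for the first row's missing predecessor is an assumption of this sketch. Note that the "Percentage of total" expression in the table recomputes the sum once per item, while the sketch computes it once up front.

```js
// Hypothetical sample data for illustration only.
const items = [{ value: 10 }, { value: 30 }, { value: 60 }];

// Running total: each entry is the sum of all values up to and including this row.
let running = 0;
const cum = items.map((d) => (running += d.value));

// Previous row's value: the first row has no predecessor.
const prev = items.map((d, i) => (i > 0 ? items[i - 1].value : undefined));

// Percentage of total: compute the total once, then divide each value by it.
const grandTotal = items.reduce((acc, d) => acc + d.value, 0);
const pct = items.map((d) => d.value / grandTotal);

// Attach the computed columns to copies of the original items.
const withColumns = items.map((d, i) => ({
  ...d,
  cum: cum[i],
  prev: prev[i],
  pct: pct[i],
}));

console.log(cum); // [ 10, 40, 100 ]
console.log(prev); // [ undefined, 10, 30 ]
console.log(pct); // [ 0.1, 0.3, 0.6 ]
```
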
## Reshaping

| I want to... | Use |
|---|---|
| Long to wide (pivot columns out) | `pivotWider({ namesFrom: 'key', valuesFrom: 'value' })` |
| Wide to long (unpivot) | `pivotLonger({ cols: ['col1', 'col2'], namesTo: 'key', valuesTo: 'value' })` |
| Fill missing combinations | `complete({ col1: ['a', 'b'], col2: [1, 2] })` |
| Generate all combinations | `expand({ col1: ['a', 'b'], col2: [1, 2] })` |
| Fill nulls forward (down) | `fill('column')` |

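To make the long/wide distinction concrete, here is a small plain-JavaScript sketch of both directions. It is only an illustration of the reshaping idea and does not call tidyjs: the `widen` and `lengthen` helpers and the `long` data are hypothetical, the sketch assumes a single identifying column, and it skips columns that are absent from a row, which may not match how the library treats missing values.

```js
// Hypothetical long-format data: one row per (id, key) pair.
const long = [
  { id: 1, key: 'x', value: 10 },
  { id: 1, key: 'y', value: 20 },
  { id: 2, key: 'x', value: 30 },
];

// Long to wide: one row per id, with a column for each distinct key.
function widen(items, idName, namesFrom, valuesFrom) {
  const byId = new Map();
  for (const item of items) {
    const id = item[idName];
    if (!byId.has(id)) byId.set(id, { [idName]: id });
    byId.get(id)[item[namesFrom]] = item[valuesFrom];
  }
  return [...byId.values()];
}

// Wide to long: one row per (item, column) pair for the listed columns.
function lengthen(items, cols, namesTo, valuesTo) {
  return items.flatMap((item) => {
    // Everything that is not being unpivoted is carried along unchanged.
    const rest = Object.fromEntries(
      Object.entries(item).filter(([k]) => !cols.includes(k))
    );
    return cols
      .filter((c) => c in item) // skip columns this row does not have
      .map((c) => ({ ...rest, [namesTo]: c, [valuesTo]: item[c] }));
  });
}

const wide = widen(long, 'id', 'key', 'value');
console.log(wide); // [ { id: 1, x: 10, y: 20 }, { id: 2, x: 30 } ]

const backToLong = lengthen(wide, ['x', 'y'], 'key', 'value');
console.log(backToLong.length); // 3
```
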
## Joins

| I want to... | Use |
|---|---|
| Inner join (matching rows only) | `innerJoin(otherData, { by: 'id' })` |
| Left join (keep all left rows) | `leftJoin(otherData, { by: 'id' })` |
| Full outer join | `fullJoin(otherData, { by: 'id' })` |
| Join on different column names | `leftJoin(other, { by: { myId: 'theirId' } })` |

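The difference between an inner and a left join is only in what happens to left rows without a match. A minimal plain-JavaScript sketch of that difference follows. It does not use tidyjs: the `joinBy` helper and the `orders` and `prices` data are hypothetical, a full outer join is left out, and unmatched left rows are simply copied as they are, which may differ from how the library fills the missing columns.

```js
// Hypothetical sample data for illustration only.
const orders = [
  { id: 1, item: 'pen' },
  { id: 2, item: 'ink' },
  { id: 3, item: 'pad' },
];
const prices = [
  { id: 1, price: 2 },
  { id: 3, price: 5 },
];

// Join `left` to `right` on a shared key column.
// keepUnmatched = false behaves like an inner join, true like a left join.
function joinBy(left, right, key, keepUnmatched) {
  // Index the right side by key so each lookup is a Map access.
  const index = new Map();
  for (const r of right) {
    if (!index.has(r[key])) index.set(r[key], []);
    index.get(r[key]).push(r);
  }
  return left.flatMap((l) => {
    const matches = index.get(l[key]);
    if (matches) return matches.map((r) => ({ ...l, ...r }));
    return keepUnmatched ? [{ ...l }] : [];
  });
}

const innerRows = joinBy(orders, prices, 'id', false);
const leftRows = joinBy(orders, prices, 'id', true);

console.log(innerRows.map((d) => d.id)); // [ 1, 3 ]
console.log(leftRows.map((d) => d.id)); // [ 1, 2, 3 ]
```
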
## Conditional & Utility

| I want to... | Use |
|---|---|
| Conditionally apply a transform | `when(condition, [filter(...)])` |
| Apply a custom function to each item | `map((d) => ({ ...d, upper: d.name.toUpperCase() }))` |
| Log intermediate pipeline state | `debug()` or `debug('label')` |
| Calculate a rate per item | `mutate({ r: rate('num', 'denom') })` |
| Simple math rate | `TMath.rate(numerator, denominator)` |

## Aliases

| Canonical | Alias |
|---|---|
| `arrange` | `sort` |
| `select` | `pick` |
| `addRows` | `addItems` |
|
package/package.json
CHANGED
@@ -1,11 +1,12 @@
 {
   "name": "@tidyjs/tidy",
-  "version": "2.6.
+  "version": "2.6.1",
   "description": "Tidy up your data with JavaScript, inspired by dplyr and the tidyverse",
   "main": "dist/lib/index.js",
   "module": "dist/es/index.js",
   "files": [
-    "dist"
+    "dist",
+    "genai-docs"
   ],
   "types": "dist/tidy.d.ts",
   "typings": "dist/tidy.d.ts",