GLDF 0.9.0__py3-none-any.whl

@@ -0,0 +1,394 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": 1,
6
+ "id": "955d4f40",
7
+ "metadata": {},
8
+ "outputs": [],
9
+ "source": [
10
+ "import GLDF"
11
+ ]
12
+ },
13
+ {
14
+ "cell_type": "markdown",
15
+ "id": "33f5fb3e",
16
+ "metadata": {},
17
+ "source": [
18
+ "# Tutorial 02: More Detailed Configurations\n",
19
+ "\n",
20
+ "This tutorial again presents two simple examples (similar to those in the first tutorial):\n",
21
+ "1. Application to time series data with temporally persistent regimes, this time configured to allow for latent confounders\n",
22
+ "2. Application to gridded data with spatially localized regimes, this time using PC-stable in place of FCI"
23
+ ]
24
+ },
25
+ {
26
+ "cell_type": "markdown",
27
+ "id": "b32274c9",
28
+ "metadata": {},
29
+ "source": [
30
+ "## 1. Time-Series and Latents\n",
31
+ "\n",
32
+ "Time series with latent confounders work out of the box through tigramite's LPCMCI implementation: call 'run_hccd_temporal_regimes' with the 'allow_latent_confounding=True' option.\n",
33
+ "This part of the tutorial explains how the built-in configurations can be customized, using a reimplementation of this feature as the example.\n",
34
+ "\n",
35
+ "The standard configurations are dataclasses. Their simple parameters can be modified by obtaining a configuration object\n",
36
+ "(e.g. via configure_hccd_temporal_regimes), overwriting the value, and then calling run(data) on the configuration object.\n",
37
+ "The available parameters can be inspected as follows:"
38
+ ]
39
+ },
40
+ {
41
+ "cell_type": "code",
42
+ "execution_count": 2,
43
+ "id": "e09a004f",
44
+ "metadata": {},
45
+ "outputs": [
46
+ {
47
+ "name": "stdout",
48
+ "output_type": "stream",
49
+ "text": [
50
+ "{'is_timeseries': True, 'alpha': 0.01, 'min_regime_fraction': 0.15, 'indicator_resolution_granularity': 100, 'regimes_are_large': True, 'tau_max': 1, 'alpha_pc1': 0.1, '_data': None}\n"
51
+ ]
52
+ }
53
+ ],
54
+ "source": [
55
+ "\n",
56
+ "from dataclasses import asdict\n",
57
+ "print(asdict(GLDF.frontend.configure_hccd_temporal_regimes()))"
58
+ ]
59
+ },
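The workflow described above (obtain a configuration object, overwrite a field, call `run(data)`) can be sketched with a stand-in dataclass. `DemoConfig` and its `run` stub are hypothetical and not part of GLDF; only the field names are borrowed from the printed parameters. The last lines illustrate that `asdict` returns plain keyword arguments, so a dataclass can be rebuilt from them.

```python
from dataclasses import dataclass, asdict


# Hypothetical stand-in for a configuration object; NOT a GLDF class.
@dataclass
class DemoConfig:
    alpha: float = 0.01
    tau_max: int = 1
    regimes_are_large: bool = True

    def run(self, data):
        # A real configuration would run the analysis; this stub only
        # reports which parameters it would use.
        return f"alpha={self.alpha}, tau_max={self.tau_max}, n={len(data)}"


config = DemoConfig()
config.alpha = 0.05                    # overwrite a simple parameter in place
summary = config.run([0.1, 0.2, 0.3])  # then run on the data
print(summary)

# asdict() yields a plain dict, which can be passed back as keyword arguments.
clone = DemoConfig(**asdict(config))
print(clone == config)
```

Dataclass fields are ordinary attributes, so no setter or rebuild step is needed before calling `run`.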
60
+ {
61
+ "cell_type": "markdown",
62
+ "id": "208b3652",
63
+ "metadata": {},
64
+ "source": [
65
+ "The backend has a rather large number of modules, all of which can operate mostly independently.\n",
66
+ "If we wanted to change only some of them, we could simply copy and paste the code that instantiates and composes all of them, and swap out the ones we want to change.\n",
67
+ "However, this would not be maintainable: if a future version improves or restructures any part of the frontend, or if we want to combine\n",
68
+ "different modifications to different modules of the frontend, this approach very quickly reaches its limits.\n",
69
+ "\n",
70
+ "So instead, the frontend is made modular as well. It is exposed in the form of \"configuration classes\", which implement getters for (many) different (sub-)modules\n",
71
+ "in the backend. These (sub-)modules are constructed and composed with other (sub-)modules, which are in turn obtained through other getters.\n",
72
+ "This means that any (sub-)module can be swapped out simply by replacing its getter! All other (sub-)modules connecting to this component will\n",
73
+ "automatically get \"rewired\", as they access this component only through the (new) getter.\n",
74
+ "\n",
75
+ "Put simply, if we want to replace PCMCI+ by LPCMCI, for example, we can start from the \"default\" time-series configuration\n",
76
+ "and replace the \"universal CD\" getter.\n",
77
+ "In this particular case (see the documentation of bridges.tigramite.alg_lpcmci), we also have to switch out the\n",
78
+ "HCCD controller, but this can be achieved just as easily."
79
+ ]
80
+ },
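The "rewiring" through getters described above can be illustrated with a minimal, self-contained sketch. The classes and method names below are hypothetical stand-ins, not GLDF's actual API, and strings are used in place of real modules. The point is that the controller is built only from what the getter returns, so overriding one getter changes every component that depends on it.

```python
class DemoConfigBase:
    """Hypothetical configuration class; NOT part of GLDF."""

    def get_cd_algorithm(self) -> str:
        return "PCMCI+"

    def get_controller(self) -> str:
        # The controller never names the algorithm directly;
        # it always asks the getter for it.
        return f"Controller({self.get_cd_algorithm()})"


class DemoConfigLatent(DemoConfigBase):
    def get_cd_algorithm(self) -> str:
        # Replacing only this getter changes what the controller receives.
        return "LPCMCI"


default_controller = DemoConfigBase().get_controller()
latent_controller = DemoConfigLatent().get_controller()
print(default_controller)
print(latent_controller)
```

Because `get_controller` is inherited unchanged, the subclass stays small and picks up any later improvements to the base class.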
81
+ {
82
+ "cell_type": "code",
83
+ "execution_count": 5,
84
+ "id": "299e7825",
85
+ "metadata": {},
86
+ "outputs": [],
87
+ "source": [
88
+ "class ConfigureHCCD_LPCMCI(GLDF.frontend.ConfigureHCCD):\n",
89
+ "    def __init__(self, regimes_are_large: bool=True, alpha: float=0.01, alpha_pc1: float=0.1):\n",
90
+ "        ts_config = GLDF.frontend.configure_hccd_temporal_regimes(regimes_are_large=regimes_are_large, alpha=alpha, alpha_pc1=alpha_pc1)\n",
91
+ "        from dataclasses import asdict\n",
92
+ "        super().__init__(**asdict(ts_config))\n",
93
+ "\n",
94
+ "\n",
95
+ "    def get_universal_cd(self) -> GLDF.hccd.abstract_cd_t:\n",
96
+ "        return GLDF.bridges.tigramite.alg_lpcmci(data_format=self.get_data_manager())\n",
97
+ "\n",
98
+ "    def get_controller(self) -> GLDF.hccd.Controller:\n",
99
+ "        return GLDF.hccd.ControllerTimeseriesLPCMCI(universal_cd=self.get_universal_cd(), testing_backend=self.get_transitionable_backend(), state_space_construction=self.get_state_space_construction())"
100
+ ]
101
+ },
102
+ {
103
+ "cell_type": "markdown",
104
+ "id": "7001797c",
105
+ "metadata": {},
106
+ "source": [
107
+ "That's it already. After generating data, we can simply create an instance of this configuration and run it on the data."
108
+ ]
109
+ },
110
+ {
111
+ "cell_type": "code",
112
+ "execution_count": 8,
113
+ "id": "50e8eb43",
114
+ "metadata": {},
115
+ "outputs": [],
116
+ "source": [
117
+ "import numpy as np\n",
118
+ "\n",
119
+ "N = 1000\n",
120
+ "\n",
121
+ "R = np.zeros(N, dtype=bool)\n",
122
+ "R[int(N/2):] = True\n",
123
+ "\n",
124
+ "rng = np.random.default_rng()\n",
125
+ "X_noise = rng.standard_normal(N)\n",
126
+ "Y_noise = rng.standard_normal(N)\n",
127
+ "Z_noise = rng.standard_normal(N)\n",
128
+ "L_noise = rng.standard_normal(N)\n",
129
+ "W_noise = rng.standard_normal(N)\n",
130
+ "\n",
131
+ "X = np.empty_like(X_noise)\n",
132
+ "Y = np.empty_like(Y_noise)\n",
133
+ "Z = np.empty_like(Z_noise)\n",
134
+ "L = np.empty_like(L_noise)\n",
135
+ "W = np.empty_like(W_noise)\n",
136
+ "\n",
137
+ "def lag_one_or_zero(values, t):\n",
138
+ "    return values[t-1] if t > 0 else 0.0\n",
139
+ "\n",
140
+ "for t in range(N):\n",
141
+ "    # latent\n",
142
+ "    L[t] = L_noise[t] # + 0.25 * lag_one_or_zero(L, t)\n",
143
+ "    # observables:\n",
144
+ "    W[t] = W_noise[t] + 0.15 * lag_one_or_zero(W, t)\n",
145
+ "    X[t] = X_noise[t] + 0.2 * lag_one_or_zero(X, t) - 0.7 * lag_one_or_zero(W, t) + 1.2 * L[t]\n",
146
+ "    Z[t] = Z_noise[t]\n",
147
+ "    Y[t] = Y_noise[t] + R[t] * lag_one_or_zero(X, t) + Z[t] + 0.8 * lag_one_or_zero(L, t)\n",
148
+ "\n",
149
+ "data = np.array([X,Y,Z,W]).T\n",
150
+ "var_names = [\"X\", \"Y\", \"Z\", \"W\"]"
151
+ ]
152
+ },
153
+ {
154
+ "cell_type": "markdown",
155
+ "id": "a10823d6",
156
+ "metadata": {},
157
+ "source": [
158
+ "Regime detection with latents in time series is still experimental; in particular, it may have low recall on regimes.\n",
159
+ "For example, uncommenting the auto-lag term on L in the example above leads to problems. The deeper problem is not necessarily\n",
160
+ "related to auto-lagged confounders, but they provide one example of why time-series causal discovery with latents\n",
161
+ "is, in general, a very complex and difficult problem."
162
+ ]
163
+ },
164
+ {
165
+ "cell_type": "code",
166
+ "execution_count": 9,
167
+ "id": "fda65558",
168
+ "metadata": {},
169
+ "outputs": [],
170
+ "source": [
171
+ "import numpy as np\n",
172
+ "\n",
173
+ "N = 1000\n",
174
+ "\n",
175
+ "R = np.zeros(N, dtype=bool)\n",
176
+ "R[int(N/2):] = True\n",
177
+ "\n",
178
+ "rng = np.random.default_rng()\n",
179
+ "L = rng.standard_normal(N)\n",
180
+ "W = rng.standard_normal(N)\n",
181
+ "X = rng.standard_normal(N) + L + W\n",
182
+ "Z = rng.standard_normal(N)\n",
183
+ "Y = rng.standard_normal(N) + L + R * Z\n",
184
+ "\n",
185
+ "data = np.array([X,Y,Z,W]).T\n",
186
+ "var_names = [\"X\", \"Y\", \"Z\", \"W\"]"
187
+ ]
188
+ },
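The regime structure in the data generated above can be checked by hand before running the full analysis. The sketch below regenerates the relevant variables with the standard library instead of NumPy. The fixed seed and the `pearson` helper are assumptions made for illustration. It compares the correlation between Z and Y in the two halves of the sample: the link Z → Y is only active where R is true, so a clear dependence is expected only in the second half.

```python
import random


def pearson(a, b):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)  # fixed seed (an assumption) for reproducibility
N = 1000
half = N // 2

L = [rng.gauss(0, 1) for _ in range(N)]  # latent confounder
Z = [rng.gauss(0, 1) for _ in range(N)]
# Y = noise + L + R * Z, where R switches on in the second half
Y = [rng.gauss(0, 1) + L[t] + (Z[t] if t >= half else 0.0) for t in range(N)]

corr_first = pearson(Z[:half], Y[:half])
corr_second = pearson(Z[half:], Y[half:])
print(f"corr(Z, Y), first half:  {corr_first:+.2f}")
print(f"corr(Z, Y), second half: {corr_second:+.2f}")
```

A check like this only works because the regime boundary is known here; the point of the framework is to recover such regimes when the boundary is unknown.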
189
+ {
190
+ "cell_type": "code",
191
+ "execution_count": 10,
192
+ "id": "3c3876ca",
193
+ "metadata": {},
194
+ "outputs": [],
195
+ "source": [
196
+ "config = ConfigureHCCD_LPCMCI()\n",
197
+ "result = config.run(data)"
198
+ ]
199
+ },
200
+ {
201
+ "cell_type": "code",
202
+ "execution_count": 11,
203
+ "id": "647d5348",
204
+ "metadata": {},
205
+ "outputs": [
206
+ {
207
+ "data": {
208
+ "image/png": "",
209
+ "text/plain": [
210
+ "<Figure size 400x400 with 1 Axes>"
211
+ ]
212
+ },
213
+ "metadata": {},
214
+ "output_type": "display_data"
215
+ },
216
+ {
217
+ "data": {
218
+ "image/png": "",
219
+ "text/plain": [
220
+ "<Figure size 640x480 with 1 Axes>"
221
+ ]
222
+ },
223
+ "metadata": {},
224
+ "output_type": "display_data"
225
+ }
226
+ ],
227
+ "source": [
228
+ "import matplotlib.pyplot as plt\n",
229
+ "result.var_names = var_names\n",
230
+ "result.plot_labeled_union_graph()\n",
231
+ "plt.show()\n",
232
+ "for mi in result.model_indicators():\n",
233
+ "    mi.plot_resolution()\n",
234
+ "    plt.show()"
235
+ ]
236
+ },
237
+ {
238
+ "cell_type": "markdown",
239
+ "id": "032bb099",
240
+ "metadata": {},
241
+ "source": [
242
+ "## 2. Spatial Regimes with PC-stable\n",
243
+ "\n",
244
+ "In the first tutorial, we ran an example with spatial regimes and FCI (allowing for latent confounding).\n",
245
+ "Here we re-configure the setup to run with PC-stable (no latent confounding).\n",
246
+ "\n",
247
+ "As in the first part, we copy a default configuration (in this case configure_hccd_spatial_regimes) and replace the\n",
248
+ "universal-cd module."
249
+ ]
250
+ },
251
+ {
252
+ "cell_type": "code",
253
+ "execution_count": 12,
254
+ "id": "1bd2f284",
255
+ "metadata": {},
256
+ "outputs": [],
257
+ "source": [
258
+ "class ConfigureHCCD_PCstable(GLDF.frontend.ConfigureHCCD):\n",
259
+ "    def __init__(self, regimes_are_large: bool=True, alpha: float=0.01):\n",
260
+ "        std_config = GLDF.frontend.configure_hccd_spatial_regimes(regimes_are_large=regimes_are_large, alpha=alpha)\n",
261
+ "        from dataclasses import asdict\n",
262
+ "        super().__init__(**asdict(std_config))\n",
263
+ "\n",
264
+ "    def get_universal_cd(self) -> GLDF.hccd.abstract_cd_t:\n",
265
+ "        return GLDF.bridges.causal_learn.alg_pc_stable(data_format=self.get_data_manager())"
266
+ ]
267
+ },
268
+ {
269
+ "cell_type": "markdown",
270
+ "id": "0678b4c1",
271
+ "metadata": {},
272
+ "source": [
273
+ "We use the same data-generation as in the first tutorial:"
274
+ ]
275
+ },
276
+ {
277
+ "cell_type": "code",
278
+ "execution_count": 13,
279
+ "id": "4b94533c",
280
+ "metadata": {},
281
+ "outputs": [
282
+ {
283
+ "name": "stdout",
284
+ "output_type": "stream",
285
+ "text": [
286
+ "(100, 100, 4)\n"
287
+ ]
288
+ }
289
+ ],
290
+ "source": [
291
+ "import numpy as np\n",
292
+ "\n",
293
+ "def get_data(seed=None):\n",
294
+ "    data_size = (100, 100)\n",
295
+ "    A = np.full(data_size, False)\n",
296
+ "    A[:50,:] = True\n",
297
+ "    B = np.full(data_size, False)\n",
298
+ "    B[:,:50] = True\n",
299
+ "\n",
300
+ "    rng = np.random.default_rng(seed=seed)\n",
301
+ "\n",
302
+ "    X = rng.standard_normal(data_size)\n",
303
+ "    Y = rng.standard_normal(data_size)\n",
304
+ "    L = rng.standard_normal(data_size) # latent\n",
305
+ "    Z = rng.standard_normal(data_size) + A * X + L\n",
306
+ "    W = rng.standard_normal(data_size) + Y + B * L\n",
307
+ "\n",
308
+ "    return np.array([X,Y,Z,W]).transpose([1,2,0])\n",
309
+ "\n",
310
+ "var_names = [\"X\", \"Y\", \"Z\", \"W\"]\n",
311
+ "data = get_data(seed=17062025)\n",
312
+ "print(data.shape)"
313
+ ]
314
+ },
315
+ {
316
+ "cell_type": "markdown",
317
+ "id": "07765b97",
318
+ "metadata": {},
319
+ "source": [
320
+ "**Note:** This example *does* contain latent confounding, which PC-stable does not account for. Here, this leads to orientation conflicts (x-markings in the graph). In general, of course, there is no guarantee that violations of the assumptions will be detected."
321
+ ]
322
+ },
323
+ {
324
+ "cell_type": "code",
325
+ "execution_count": 15,
326
+ "id": "9ca17510",
327
+ "metadata": {},
328
+ "outputs": [
329
+ {
330
+ "data": {
331
+ "image/png": "",
332
+ "text/plain": [
333
+ "<Figure size 400x400 with 1 Axes>"
334
+ ]
335
+ },
336
+ "metadata": {},
337
+ "output_type": "display_data"
338
+ },
339
+ {
340
+ "data": {
341
+ "image/png": "",
342
+ "text/plain": [
343
+ "<Figure size 640x480 with 1 Axes>"
344
+ ]
345
+ },
346
+ "metadata": {},
347
+ "output_type": "display_data"
348
+ },
349
+ {
350
+ "data": {
351
+ "image/png": "",
352
+ "text/plain": [
353
+ "<Figure size 640x480 with 1 Axes>"
354
+ ]
355
+ },
356
+ "metadata": {},
357
+ "output_type": "display_data"
358
+ }
359
+ ],
360
+ "source": [
361
+ "config = ConfigureHCCD_PCstable()\n",
362
+ "result = config.run(data)\n",
363
+ "\n",
364
+ "result.var_names = var_names\n",
365
+ "result.plot_labeled_union_graph()\n",
366
+ "plt.show()\n",
367
+ "for mi in result.model_indicators():\n",
368
+ "    mi.plot_resolution()\n",
369
+ "    plt.show()"
370
+ ]
371
+ }
372
+ ],
373
+ "metadata": {
374
+ "kernelspec": {
375
+ "display_name": ".venv",
376
+ "language": "python",
377
+ "name": "python3"
378
+ },
379
+ "language_info": {
380
+ "codemirror_mode": {
381
+ "name": "ipython",
382
+ "version": 3
383
+ },
384
+ "file_extension": ".py",
385
+ "mimetype": "text/x-python",
386
+ "name": "python",
387
+ "nbconvert_exporter": "python",
388
+ "pygments_lexer": "ipython3",
389
+ "version": "3.13.5"
390
+ }
391
+ },
392
+ "nbformat": 4,
393
+ "nbformat_minor": 5
394
+ }