sqlite-chronicle 0.1__tar.gz

@@ -0,0 +1,65 @@
1
+ Metadata-Version: 2.1
2
+ Name: sqlite-chronicle
3
+ Version: 0.1
4
+ Summary: Use triggers to maintain a chronicle table of updated/deleted timestamps in SQLite
5
+ Author: Simon Willison
6
+ License: Apache-2.0
7
+ Project-URL: Homepage, https://github.com/simonw/sqlite-chronicle
8
+ Project-URL: Changelog, https://github.com/simonw/sqlite-chronicle/releases
9
+ Project-URL: Issues, https://github.com/simonw/sqlite-chronicle/issues
10
+ Project-URL: CI, https://github.com/simonw/sqlite-chronicle/actions
11
+ Classifier: License :: OSI Approved :: Apache Software License
12
+ Description-Content-Type: text/markdown
13
+ Provides-Extra: test
14
+ Requires-Dist: pytest; extra == "test"
15
+ Requires-Dist: sqlite-utils; extra == "test"
16
+
17
+ # sqlite-chronicle
18
+
19
+ [![PyPI](https://img.shields.io/pypi/v/sqlite-chronicle.svg)](https://pypi.org/project/sqlite-chronicle/)
20
+ [![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-chronicle?include_prereleases&label=changelog)](https://github.com/simonw/sqlite-chronicle/releases)
21
+ [![Tests](https://github.com/simonw/sqlite-chronicle/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-chronicle/actions?query=workflow%3ATest)
22
+ [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-chronicle/blob/main/LICENSE)
23
+
24
+ Use triggers to track when rows in a SQLite table were updated or deleted
25
+
26
+ ## Installation
27
+
28
+ ```bash
29
+ pip install sqlite-chronicle
30
+ ```
31
+
32
+ ## enable_chronicle(conn, table_name)
33
+
34
+ This module provides a single function: `sqlite_chronicle.enable_chronicle(conn, table_name)`, which does the following:
35
+
36
+ 1. Checks if a `_chronicle_{table_name}` table exists already. If so, it does nothing. Otherwise...
37
+ 2. Creates that table, with the same primary key columns as the original table plus integer columns `updated_ms` and `deleted`
38
+ 3. Creates a new row in the chronicle table corresponding to every row in the original table, setting `updated_ms` to the current timestamp in milliseconds
39
+ 4. Sets up three triggers on the table:
40
+ - An after insert trigger, which creates a new row in the chronicle table and sets `updated_ms` to the current time
41
+ - An after update trigger, which updates that timestamp and also updates any primary keys if they have changed (likely extremely rare)
42
+ - An after delete trigger, which updates that timestamp and places a `1` in the `deleted` column
43
+
44
+ The function will raise a `sqlite_chronicle.ChronicleError` exception if the table does not have a single or compound primary key.
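The primary key check described above can be sketched with `PRAGMA table_info`, whose last column (`pk`) is non-zero only for primary key columns. This is a minimal illustration, not the library's API: the helper name `primary_key_columns` and the example tables are assumptions made for the demonstration.

```python
import sqlite3


def primary_key_columns(conn, table_name):
    """Hypothetical helper: return (name, type) for each primary key column."""
    rows = conn.execute(f'PRAGMA table_info("{table_name}")').fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk);
    # pk is 0 for columns that are not part of the primary key.
    return [(row[1], row[2]) for row in rows if row[5]]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE walks (dog_id INTEGER, day TEXT, PRIMARY KEY (dog_id, day))"
)
conn.execute("CREATE TABLE notes (body TEXT)")  # no declared primary key

print(primary_key_columns(conn, "dogs"))   # [('id', 'INTEGER')]
print(primary_key_columns(conn, "walks"))  # [('dog_id', 'INTEGER'), ('day', 'TEXT')]
print(primary_key_columns(conn, "notes"))  # []
```

A table like `notes`, for which the list comes back empty, is the case in which `enable_chronicle` raises `ChronicleError`.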
45
+
46
+ The end result is a chronicle table that looks something like this:
47
+
48
+ | id | updated_ms | deleted |
49
+ |-----|---------------|---------|
50
+ | 47 | 1694408890954 | 0 |
51
+ | 48 | 1694408874863 | 1 |
52
+ | 1 | 1694408825192 | 0 |
53
+ | 2 | 1694408825192 | 0 |
54
+ | 3 | 1694408825192 | 0 |
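To make the mechanism concrete, here is a hand-written sketch of roughly what the steps above amount to for a table with a single `id` primary key. The `dogs` table, its rows, and the `NOW_MS` constant are assumptions for illustration; the SQL that the function generates may differ in detail.

```python
import sqlite3

# Current Unix time in milliseconds, computed inside SQLite
NOW_MS = "CAST((julianday('now') - 2440587.5) * 86400 * 1000 AS INTEGER)"

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO dogs (id, name) VALUES (1, 'Cleo'), (2, 'Pancakes');

-- Chronicle table: same primary key, plus updated_ms and deleted
CREATE TABLE _chronicle_dogs (
    id INTEGER PRIMARY KEY,
    updated_ms INTEGER,
    deleted INTEGER DEFAULT 0
);
-- Backfill one chronicle row per existing row
INSERT INTO _chronicle_dogs (id, updated_ms) SELECT id, {NOW_MS} FROM dogs;

CREATE TRIGGER _chronicle_dogs_ai AFTER INSERT ON dogs FOR EACH ROW
BEGIN
    INSERT INTO _chronicle_dogs (id, updated_ms) VALUES (NEW.id, {NOW_MS});
END;

CREATE TRIGGER _chronicle_dogs_au AFTER UPDATE ON dogs FOR EACH ROW
BEGIN
    UPDATE _chronicle_dogs SET updated_ms = {NOW_MS}, id = NEW.id
    WHERE id = OLD.id;
END;

CREATE TRIGGER _chronicle_dogs_ad AFTER DELETE ON dogs FOR EACH ROW
BEGIN
    UPDATE _chronicle_dogs SET updated_ms = {NOW_MS}, deleted = 1
    WHERE id = OLD.id;
END;
""")

conn.execute("INSERT INTO dogs (id, name) VALUES (3, 'Mango')")
conn.execute("DELETE FROM dogs WHERE id = 2")

rows = conn.execute(
    "SELECT id, deleted FROM _chronicle_dogs ORDER BY id"
).fetchall()
print(rows)  # [(1, 0), (2, 1), (3, 0)]
```

The deleted row stays in the chronicle table with `deleted` set to `1`, which is what lets a consumer learn about the deletion later.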
55
+
56
+ ## Applications
57
+
58
+ Chronicle tables can be used to efficiently answer the question "what rows have been inserted, updated or deleted since I last checked".
59
+
60
+ This has numerous potential applications, including:
61
+
62
+ - Synchronization and replication: other databases can "subscribe" to tables, keeping track of when they last refreshed their copy and requesting just rows that changed since the last time - and deleting rows that have been marked as deleted.
63
+ - Indexing: if you need to update an Elasticsearch index or a vector database embeddings index or similar you can run against just the records that changed since your last run - see also [The denormalized query engine design pattern](https://2017.djangocon.us/talks/the-denormalized-query-engine-design-pattern/)
64
+ - Enrichments: [datasette-enrichments](https://github.com/datasette/datasette-enrichments) needs to persist something that says "every address column should be geocoded" - then have an enrichment that runs every X seconds and looks for newly inserted or updated rows and enriches just those.
65
+ - Showing people what has changed since their last visit - "52 rows have been updated and 16 deleted since yesterday" kind of thing.
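A consumer of a chronicle table might poll it along the following lines. This is a sketch under stated assumptions: the `changes_since` helper is hypothetical, and the `_chronicle_dogs` table is built by hand from the sample rows shown earlier rather than by `enable_chronicle`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE _chronicle_dogs (
    id INTEGER PRIMARY KEY,
    updated_ms INTEGER,
    deleted INTEGER DEFAULT 0
);
INSERT INTO _chronicle_dogs (id, updated_ms, deleted) VALUES
    (47, 1694408890954, 0),
    (48, 1694408874863, 1),
    (1, 1694408825192, 0),
    (2, 1694408825192, 0),
    (3, 1694408825192, 0);
""")


def changes_since(conn, last_seen_ms):
    """Hypothetical helper: ids changed and deleted after last_seen_ms."""
    rows = conn.execute(
        "SELECT id, updated_ms, deleted FROM _chronicle_dogs "
        "WHERE updated_ms > ? ORDER BY updated_ms",
        (last_seen_ms,),
    ).fetchall()
    changed = [row[0] for row in rows if not row[2]]
    deleted = [row[0] for row in rows if row[2]]
    # The newest timestamp seen becomes the cursor for the next poll
    new_cursor = rows[-1][1] if rows else last_seen_ms
    return changed, deleted, new_cursor


changed, deleted, cursor = changes_since(conn, 1694408825192)
print(changed, deleted, cursor)  # [47] [48] 1694408890954
```

One caveat with a strict `>` comparison: a row written in the same millisecond as the stored cursor, but after the poll, would be missed, so a real consumer may want to allow for some overlap between polls.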
@@ -0,0 +1,49 @@
1
+ # sqlite-chronicle
2
+
3
+ [![PyPI](https://img.shields.io/pypi/v/sqlite-chronicle.svg)](https://pypi.org/project/sqlite-chronicle/)
4
+ [![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-chronicle?include_prereleases&label=changelog)](https://github.com/simonw/sqlite-chronicle/releases)
5
+ [![Tests](https://github.com/simonw/sqlite-chronicle/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-chronicle/actions?query=workflow%3ATest)
6
+ [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-chronicle/blob/main/LICENSE)
7
+
8
+ Use triggers to track when rows in a SQLite table were updated or deleted
9
+
10
+ ## Installation
11
+
12
+ ```bash
13
+ pip install sqlite-chronicle
14
+ ```
15
+
16
+ ## enable_chronicle(conn, table_name)
17
+
18
+ This module provides a single function: `sqlite_chronicle.enable_chronicle(conn, table_name)`, which does the following:
19
+
20
+ 1. Checks if a `_chronicle_{table_name}` table exists already. If so, it does nothing. Otherwise...
21
+ 2. Creates that table, with the same primary key columns as the original table plus integer columns `updated_ms` and `deleted`
22
+ 3. Creates a new row in the chronicle table corresponding to every row in the original table, setting `updated_ms` to the current timestamp in milliseconds
23
+ 4. Sets up three triggers on the table:
24
+ - An after insert trigger, which creates a new row in the chronicle table and sets `updated_ms` to the current time
25
+ - An after update trigger, which updates that timestamp and also updates any primary keys if they have changed (likely extremely rare)
26
+ - An after delete trigger, which updates that timestamp and places a `1` in the `deleted` column
27
+
28
+ The function will raise a `sqlite_chronicle.ChronicleError` exception if the table does not have a single or compound primary key.
29
+
30
+ The end result is a chronicle table that looks something like this:
31
+
32
+ | id | updated_ms | deleted |
33
+ |-----|---------------|---------|
34
+ | 47 | 1694408890954 | 0 |
35
+ | 48 | 1694408874863 | 1 |
36
+ | 1 | 1694408825192 | 0 |
37
+ | 2 | 1694408825192 | 0 |
38
+ | 3 | 1694408825192 | 0 |
39
+
40
+ ## Applications
41
+
42
+ Chronicle tables can be used to efficiently answer the question "what rows have been inserted, updated or deleted since I last checked".
43
+
44
+ This has numerous potential applications, including:
45
+
46
+ - Synchronization and replication: other databases can "subscribe" to tables, keeping track of when they last refreshed their copy and requesting just rows that changed since the last time - and deleting rows that have been marked as deleted.
47
+ - Indexing: if you need to update an Elasticsearch index or a vector database embeddings index or similar you can run against just the records that changed since your last run - see also [The denormalized query engine design pattern](https://2017.djangocon.us/talks/the-denormalized-query-engine-design-pattern/)
48
+ - Enrichments: [datasette-enrichments](https://github.com/datasette/datasette-enrichments) needs to persist something that says "every address column should be geocoded" - then have an enrichment that runs every X seconds and looks for newly inserted or updated rows and enriches just those.
49
+ - Showing people what has changed since their last visit - "52 rows have been updated and 16 deleted since yesterday" kind of thing.
@@ -0,0 +1,19 @@
1
+ [project]
2
+ name = "sqlite-chronicle"
3
+ version = "0.1"
4
+ description = "Use triggers to maintain a chronicle table of updated/deleted timestamps in SQLite"
5
+ readme = "README.md"
6
+ authors = [{name = "Simon Willison"}]
7
+ license = {text = "Apache-2.0"}
8
+ classifiers = [
9
+ "License :: OSI Approved :: Apache Software License"
10
+ ]
11
+
12
+ [project.urls]
13
+ Homepage = "https://github.com/simonw/sqlite-chronicle"
14
+ Changelog = "https://github.com/simonw/sqlite-chronicle/releases"
15
+ Issues = "https://github.com/simonw/sqlite-chronicle/issues"
16
+ CI = "https://github.com/simonw/sqlite-chronicle/actions"
17
+
18
+ [project.optional-dependencies]
19
+ test = ["pytest", "sqlite-utils"]
@@ -0,0 +1,4 @@
1
+ [egg_info]
2
+ tag_build =
3
+ tag_date = 0
4
+
@@ -0,0 +1,65 @@
1
+ Metadata-Version: 2.1
2
+ Name: sqlite-chronicle
3
+ Version: 0.1
4
+ Summary: Use triggers to maintain a chronicle table of updated/deleted timestamps in SQLite
5
+ Author: Simon Willison
6
+ License: Apache-2.0
7
+ Project-URL: Homepage, https://github.com/simonw/sqlite-chronicle
8
+ Project-URL: Changelog, https://github.com/simonw/sqlite-chronicle/releases
9
+ Project-URL: Issues, https://github.com/simonw/sqlite-chronicle/issues
10
+ Project-URL: CI, https://github.com/simonw/sqlite-chronicle/actions
11
+ Classifier: License :: OSI Approved :: Apache Software License
12
+ Description-Content-Type: text/markdown
13
+ Provides-Extra: test
14
+ Requires-Dist: pytest; extra == "test"
15
+ Requires-Dist: sqlite-utils; extra == "test"
16
+
17
+ # sqlite-chronicle
18
+
19
+ [![PyPI](https://img.shields.io/pypi/v/sqlite-chronicle.svg)](https://pypi.org/project/sqlite-chronicle/)
20
+ [![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-chronicle?include_prereleases&label=changelog)](https://github.com/simonw/sqlite-chronicle/releases)
21
+ [![Tests](https://github.com/simonw/sqlite-chronicle/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-chronicle/actions?query=workflow%3ATest)
22
+ [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-chronicle/blob/main/LICENSE)
23
+
24
+ Use triggers to track when rows in a SQLite table were updated or deleted
25
+
26
+ ## Installation
27
+
28
+ ```bash
29
+ pip install sqlite-chronicle
30
+ ```
31
+
32
+ ## enable_chronicle(conn, table_name)
33
+
34
+ This module provides a single function: `sqlite_chronicle.enable_chronicle(conn, table_name)`, which does the following:
35
+
36
+ 1. Checks if a `_chronicle_{table_name}` table exists already. If so, it does nothing. Otherwise...
37
+ 2. Creates that table, with the same primary key columns as the original table plus integer columns `updated_ms` and `deleted`
38
+ 3. Creates a new row in the chronicle table corresponding to every row in the original table, setting `updated_ms` to the current timestamp in milliseconds
39
+ 4. Sets up three triggers on the table:
40
+ - An after insert trigger, which creates a new row in the chronicle table and sets `updated_ms` to the current time
41
+ - An after update trigger, which updates that timestamp and also updates any primary keys if they have changed (likely extremely rare)
42
+ - An after delete trigger, which updates that timestamp and places a `1` in the `deleted` column
43
+
44
+ The function will raise a `sqlite_chronicle.ChronicleError` exception if the table does not have a single or compound primary key.
45
+
46
+ The end result is a chronicle table that looks something like this:
47
+
48
+ | id | updated_ms | deleted |
49
+ |-----|---------------|---------|
50
+ | 47 | 1694408890954 | 0 |
51
+ | 48 | 1694408874863 | 1 |
52
+ | 1 | 1694408825192 | 0 |
53
+ | 2 | 1694408825192 | 0 |
54
+ | 3 | 1694408825192 | 0 |
55
+
56
+ ## Applications
57
+
58
+ Chronicle tables can be used to efficiently answer the question "what rows have been inserted, updated or deleted since I last checked".
59
+
60
+ This has numerous potential applications, including:
61
+
62
+ - Synchronization and replication: other databases can "subscribe" to tables, keeping track of when they last refreshed their copy and requesting just rows that changed since the last time - and deleting rows that have been marked as deleted.
63
+ - Indexing: if you need to update an Elasticsearch index or a vector database embeddings index or similar you can run against just the records that changed since your last run - see also [The denormalized query engine design pattern](https://2017.djangocon.us/talks/the-denormalized-query-engine-design-pattern/)
64
+ - Enrichments: [datasette-enrichments](https://github.com/datasette/datasette-enrichments) needs to persist something that says "every address column should be geocoded" - then have an enrichment that runs every X seconds and looks for newly inserted or updated rows and enriches just those.
65
+ - Showing people what has changed since their last visit - "52 rows have been updated and 16 deleted since yesterday" kind of thing.
@@ -0,0 +1,9 @@
1
+ README.md
2
+ pyproject.toml
3
+ sqlite_chronicle.py
4
+ sqlite_chronicle.egg-info/PKG-INFO
5
+ sqlite_chronicle.egg-info/SOURCES.txt
6
+ sqlite_chronicle.egg-info/dependency_links.txt
7
+ sqlite_chronicle.egg-info/requires.txt
8
+ sqlite_chronicle.egg-info/top_level.txt
9
+ tests/test_sqlite_chronicle.py
@@ -0,0 +1,4 @@
1
+
2
+ [test]
3
+ pytest
4
+ sqlite-utils
@@ -0,0 +1 @@
1
+ sqlite_chronicle
@@ -0,0 +1,108 @@
+ import sqlite3
+ import textwrap
+
+
+ class ChronicleError(Exception):
+     pass
+
+
+ def enable_chronicle(conn: sqlite3.Connection, table_name: str):
+     c = conn.cursor()
+
+     # Check if the _chronicle_ table exists
+     c.execute(
+         f"SELECT name FROM sqlite_master WHERE type='table' AND name='_chronicle_{table_name}';"
+     )
+     if c.fetchone():
+         return
+
+     # Determine primary key columns and their types
+     c.execute(f'PRAGMA table_info("{table_name}");')
+     primary_key_columns = [
+         # cid, name, type, notnull, dflt_value, pk
+         (row[1], row[2])
+         for row in c.fetchall()
+         if row[5]
+     ]
+     if not primary_key_columns:
+         raise ChronicleError(f"Table {table_name} has no primary keys")
+
+     # Create the _chronicle_ table
+     pk_def = ", ".join(
+         [f'"{col_name}" {col_type}' for col_name, col_type in primary_key_columns]
+     )
+
+     with conn:
+         c.execute(
+             textwrap.dedent(
+                 f"""
+             CREATE TABLE "_chronicle_{table_name}" (
+                 {pk_def},
+                 updated_ms INTEGER,
+                 deleted INTEGER DEFAULT 0,
+                 PRIMARY KEY ({', '.join([f'"{col[0]}"' for col in primary_key_columns])})
+             );
+         """
+             )
+         )
+         # Add an index on the updated_ms column
+         c.execute(
+             f"""
+             CREATE INDEX "_chronicle_{table_name}_updated_ms" ON "_chronicle_{table_name}" (updated_ms);
+         """.strip()
+         )
+
+         # Populate the _chronicle_ table with existing rows from the original table
+         current_time_expr = (
+             "CAST((julianday('now') - 2440587.5) * 86400 * 1000 AS INTEGER)"
+         )
+         c.execute(
+             f"""
+             INSERT INTO "_chronicle_{table_name}" ({', '.join([f'"{col[0]}"' for col in primary_key_columns])}, updated_ms)
+             SELECT {', '.join([f'"{col[0]}"' for col in primary_key_columns])}, {current_time_expr}
+             FROM "{table_name}";
+         """
+         )
+
+         # Create the after insert trigger
+         c.execute(
+             f"""
+             CREATE TRIGGER "_chronicle_{table_name}_ai"
+             AFTER INSERT ON "{table_name}"
+             FOR EACH ROW
+             BEGIN
+                 INSERT INTO "_chronicle_{table_name}" ({', '.join([f'"{col[0]}"' for col in primary_key_columns])}, updated_ms)
+                 VALUES ({', '.join(['NEW.' + f'"{col[0]}"' for col in primary_key_columns])}, {current_time_expr});
+             END;
+         """
+         )
+
+         # Create the after update trigger
+         c.execute(
+             f"""
+             CREATE TRIGGER "_chronicle_{table_name}_au"
+             AFTER UPDATE ON "{table_name}"
+             FOR EACH ROW
+             BEGIN
+                 UPDATE "_chronicle_{table_name}"
+                 SET updated_ms = {current_time_expr},
+                 -- Also update primary key columns if they have changed:
+                 {', '.join([f'"{col[0]}" = NEW."{col[0]}"' for col in primary_key_columns])}
+                 WHERE { ' AND '.join([f'"{col[0]}" = OLD."{col[0]}"' for col in primary_key_columns]) };
+             END;
+         """
+         )
+
+         # Create the after delete trigger
+         c.execute(
+             f"""
+             CREATE TRIGGER "_chronicle_{table_name}_ad"
+             AFTER DELETE ON "{table_name}"
+             FOR EACH ROW
+             BEGIN
+                 UPDATE "_chronicle_{table_name}"
+                 SET updated_ms = {current_time_expr}, deleted = 1
+                 WHERE { ' AND '.join([f'"{col[0]}" = OLD."{col[0]}"' for col in primary_key_columns]) };
+             END;
+         """
+         )
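The `current_time_expr` expression above is not self-explanatory: `julianday('now')` returns the current time as a fractional Julian day number, 2440587.5 is the Julian day of the Unix epoch (1970-01-01 00:00:00 UTC), and multiplying the difference by 86400 * 1000 converts days to milliseconds. A small check, which assumes nothing beyond the standard library, compares it with Python's own clock:

```python
import sqlite3
import time

current_time_expr = "CAST((julianday('now') - 2440587.5) * 86400 * 1000 AS INTEGER)"

conn = sqlite3.connect(":memory:")
sqlite_ms = conn.execute(f"SELECT {current_time_expr}").fetchone()[0]
python_ms = int(time.time() * 1000)

# Both values are Unix timestamps in milliseconds, read a moment apart,
# so they should differ by far less than a second.
print(sqlite_ms, python_ms, abs(sqlite_ms - python_ms) < 1000)
```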
@@ -0,0 +1,75 @@
+ import pytest
+ import sqlite_utils
+ from sqlite_chronicle import enable_chronicle
+ import time
+ from unittest.mock import ANY
+
+
+ @pytest.mark.parametrize("table_name", ("dogs", "dogs and stuff", "weird.table.name"))
+ @pytest.mark.parametrize("pks", (["id"], ["id", "name"]))
+ def test_enable_chronicle(table_name, pks):
+     chronicle_table = f"_chronicle_{table_name}"
+     db = sqlite_utils.Database(memory=True)
+     db[table_name].insert_all(
+         [
+             {"id": 1, "name": "Cleo", "color": "black"},
+             {"id": 2, "name": "Pancakes", "color": "corgi"},
+         ],
+         pk=pks[0] if len(pks) == 1 else pks,
+     )
+     enable_chronicle(db.conn, table_name)
+     # It should have the same primary keys
+     assert db[chronicle_table].pks == pks
+     # Should also have updated_ms and deleted columns
+     assert set(db[chronicle_table].columns_dict.keys()) == set(
+         pks + ["updated_ms", "deleted"]
+     )
+     # With an index
+     assert db[chronicle_table].indexes[0].columns == ["updated_ms"]
+     if pks == ["id"]:
+         expected = [
+             {"id": 1, "updated_ms": ANY, "deleted": 0},
+             {"id": 2, "updated_ms": ANY, "deleted": 0},
+         ]
+     else:
+         expected = [
+             {"id": 1, "name": "Cleo", "updated_ms": ANY, "deleted": 0},
+             {"id": 2, "name": "Pancakes", "updated_ms": ANY, "deleted": 0},
+         ]
+     assert list(db[chronicle_table].rows) == expected
+     # Running it again should do nothing because table exists
+     enable_chronicle(db.conn, table_name)
+     # Insert a row
+     db[table_name].insert({"id": 3, "name": "Mango", "color": "orange"})
+     get_by = 3 if pks == ["id"] else (3, "Mango")
+     row = db[chronicle_table].get(get_by)
+     if pks == ["id"]:
+         assert row == {"id": 3, "updated_ms": ANY, "deleted": 0}
+     else:
+         assert row == {"id": 3, "name": "Mango", "updated_ms": ANY, "deleted": 0}
+     record_timestamp = db[chronicle_table].get(get_by)["updated_ms"]
+     time.sleep(0.01)
+     # Update a row
+     db[table_name].update(get_by, {"color": "mango"})
+     assert db[chronicle_table].get(get_by)["updated_ms"] > record_timestamp
+     # Delete a row
+     assert db[table_name].count == 3
+     time.sleep(0.01)
+     db[table_name].delete(get_by)
+     assert db[table_name].count == 2
+     assert db[chronicle_table].get(get_by)["deleted"] == 1
+     new_record_timestamp = db[chronicle_table].get(get_by)["updated_ms"]
+     assert new_record_timestamp > record_timestamp
+     # Now update a column that's part of the compound primary key
+     time.sleep(0.1)
+     if pks == ["id", "name"]:
+         db[table_name].update((2, "Pancakes"), {"name": "Pancakes the corgi"})
+         # This should have renamed the row in the chronicle table as well
+         renamed_row = db[chronicle_table].get((2, "Pancakes the corgi"))
+         assert renamed_row["updated_ms"] > record_timestamp
+     else:
+         # Update single primary key
+         db[table_name].update(2, {"id": 4})
+         # This should have renamed the row in the chronicle table as well
+         renamed_row = db[chronicle_table].get(4)
+         assert renamed_row["updated_ms"] > record_timestamp