airbyte-source-commcare 0.1.0__py3-none-any.whl

@@ -0,0 +1,102 @@
1
+ Metadata-Version: 2.1
2
+ Name: airbyte-source-commcare
3
+ Version: 0.1.0
4
+ Summary: Source implementation for Commcare.
5
+ Author: Airbyte
6
+ Author-email: contact@airbyte.io
7
+ Description-Content-Type: text/markdown
8
+ Requires-Dist: airbyte-cdk
9
+ Requires-Dist: bigquery-schema-generator ~=1.5
10
+ Requires-Dist: gbqschema-converter ~=1.2.0
11
+ Requires-Dist: flatten-json ~=0.1.13
12
+ Provides-Extra: tests
13
+ Requires-Dist: requests-mock ~=1.9.3 ; extra == 'tests'
14
+ Requires-Dist: pytest ~=6.1 ; extra == 'tests'
15
+ Requires-Dist: pytest-mock ~=3.6.1 ; extra == 'tests'
16
+
17
+ # Commcare Source
18
+
19
+ This is the repository for the Commcare source connector, written in Python.
20
+ For information about how to use this connector within Airbyte, see [the documentation](https://docs.airbyte.io/integrations/sources/commcare).
21
+
22
+
23
+ **To iterate on this connector, make sure to complete this prerequisites section.**
24
+
25
+
26
+ From this connector directory, create a virtual environment:
27
+ ```
28
+ python -m venv .venv
29
+ ```
30
+
31
+ This will generate a virtualenv for this module in `.venv/`. Make sure this venv is active in your
32
+ development environment of choice. To activate it from the terminal, run:
33
+ ```
34
+ source .venv/bin/activate
35
+ pip install -r requirements.txt
36
+ pip install '.[tests]'
37
+ ```
38
+ If you are in an IDE, follow your IDE's instructions to activate the virtualenv.
39
+
40
+ Note that while we are installing dependencies from `requirements.txt`, you should only edit `setup.py` for your dependencies. `requirements.txt` is
41
+ used for editable installs (`pip install -e`) to pull in Python dependencies from the monorepo and will call `setup.py`.
42
+ If this is mumbo jumbo to you, don't worry about it, just put your deps in `setup.py` but install using `pip install -r requirements.txt` and everything
43
+ should work as you expect.
44
+
45
+ **If you are a community contributor**, follow the instructions in the [documentation](https://docs.airbyte.io/integrations/sources/commcare)
46
+ to generate the necessary credentials. Then create a file `secrets/config.json` conforming to the `source_commcare/spec.yaml` file.
47
+ Note that any directory named `secrets` is gitignored across the entire Airbyte repo, so there is no danger of accidentally checking in sensitive information.
48
+ See `integration_tests/sample_config.json` for a sample config file.
49
+
50
+ **If you are an Airbyte core member**, copy the credentials in Lastpass under the secret name `source commcare test creds`
51
+ and place them into `secrets/config.json`.
52
+
53
+ ```
54
+ python main.py spec
55
+ python main.py check --config secrets/config.json
56
+ python main.py discover --config secrets/config.json
57
+ python main.py read --config secrets/config.json --catalog integration_tests/configured_catalog.json
58
+ ```
59
+
60
+
61
+
62
+ **Via [`airbyte-ci`](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md) (recommended):**
63
+ ```bash
64
+ airbyte-ci connectors --name=source-commcare build
65
+ ```
66
+
67
+ An image will be built with the tag `airbyte/source-commcare:dev`.
68
+
69
+ **Via `docker build`:**
70
+ ```bash
71
+ docker build -t airbyte/source-commcare:dev .
72
+ ```
73
+
74
+ Then run any of the connector commands as follows:
75
+ ```
76
+ docker run --rm airbyte/source-commcare:dev spec
77
+ docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-commcare:dev check --config /secrets/config.json
78
+ docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-commcare:dev discover --config /secrets/config.json
79
+ docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/integration_tests:/integration_tests airbyte/source-commcare:dev read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json
80
+ ```
81
+
82
+ You can run our full test suite locally using [`airbyte-ci`](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md):
83
+ ```bash
84
+ airbyte-ci connectors --name=source-commcare test
85
+ ```
86
+
87
+ Customize the `acceptance-test-config.yml` file to configure tests. See [Connector Acceptance Tests](https://docs.airbyte.com/connector-development/testing-connectors/connector-acceptance-tests-reference) for more information.
88
+ If your connector needs to create or destroy resources for use during acceptance tests, create fixtures for them and place them inside `integration_tests/acceptance.py`.
89
+
90
+ All of your dependencies should go in `setup.py`, NOT `requirements.txt`. The requirements file is only used to connect internal Airbyte dependencies in the monorepo for local development.
91
+ We split dependencies into two groups:
92
+ * dependencies required for your connector to work go in the `MAIN_REQUIREMENTS` list.
93
+ * dependencies required for testing go in the `TEST_REQUIREMENTS` list.
94
+
95
+ You've checked out the repo, implemented a million dollar feature, and you're ready to share your changes with the world. Now what?
96
+ 1. Make sure your changes are passing our test suite: `airbyte-ci connectors --name=source-commcare test`
97
+ 2. Bump the connector version in `metadata.yaml`: increment the `dockerImageTag` value. Please follow [semantic versioning for connectors](https://docs.airbyte.com/contributing-to-airbyte/resources/pull-requests-handbook/#semantic-versioning-for-connectors).
98
+ 3. Make sure the `metadata.yaml` content is up to date.
99
+ 4. Make sure the connector documentation and its changelog are up to date (`docs/integrations/sources/commcare.md`).
100
+ 5. Create a Pull Request: use [our PR naming conventions](https://docs.airbyte.com/contributing-to-airbyte/resources/pull-requests-handbook/#pull-request-title-convention).
101
+ 6. Pat yourself on the back for being an awesome contributor.
102
+ 7. Someone from Airbyte will take a look at your PR and iterate with you to merge it into master.
@@ -0,0 +1,19 @@
1
+ integration_tests/__init__.py,sha256=4Hw-PX1-VgESLF16cDdvuYCzGJtHntThLF4qIiULWeo,61
2
+ integration_tests/abnormal_state.json,sha256=-DJMcBhq1do1ntOZXzsaGmxvbDTmYmFS0cSSaUoRLZ4,86
3
+ integration_tests/acceptance.py,sha256=8eU9iSDbmHyufPvAouJGhPMgPAFTCP8IKIKHLm7u5TE,435
4
+ integration_tests/catalog.json,sha256=yj0WO6sFU4GCciYUBWjzvvfqrBh869doeOC2Pp5EI1Y,3
5
+ integration_tests/configured_catalog.json,sha256=1tbuuekIkjKW6NKdwW9ulYAYNjErYgAJqDsq08etQLE,397
6
+ integration_tests/invalid_config.json,sha256=kXy1C-2u6aW122jeDBlv1taNZDYjzdjRyer_2B1xgk8,142
7
+ integration_tests/sample_config.json,sha256=Lh9_zss5lxbqGFYpZnvi8qo5eIMNcuQ6GBDKbw1dVhM,115
8
+ integration_tests/sample_state.json,sha256=HBR6G9QPoPDFAlfAKwRmeNaQ5_56xehMLtwWr4mxd5k,63
9
+ source_commcare/__init__.py,sha256=e2gcZC5sfkJ_uidHqoPCVmqUdL9mhpMrw4Ix4uIzSng,128
10
+ source_commcare/run.py,sha256=aZZ10q2rM_fY8lmNk4Gmb_pOhDJvv-biHN6V-ut5Y6I,236
11
+ source_commcare/source.py,sha256=X_4bNim1LBDCVjqnhWViaFgsjPvMTXkZZ1yNRGRf1Aw,12956
12
+ source_commcare/spec.yaml,sha256=CMpevuCDKMFZmq16CdeilpvXKbCiQPjmI1fcgzvJQuQ,1019
13
+ unit_tests/__init__.py,sha256=4Hw-PX1-VgESLF16cDdvuYCzGJtHntThLF4qIiULWeo,61
14
+ unit_tests/test_source.py,sha256=qneLJGlvcpoyFlnvwzwP2voDtzi8z6c_N-lu0won5wM,682
15
+ airbyte_source_commcare-0.1.0.dist-info/METADATA,sha256=-GDqr4YKVK8LGyoTB9nmPitvdARyMlTt9LOzmE5dJUc,5536
16
+ airbyte_source_commcare-0.1.0.dist-info/WHEEL,sha256=oiQVh_5PnQM0E3gPdiz09WCNmwiHDMaGer_elqB3coM,92
17
+ airbyte_source_commcare-0.1.0.dist-info/entry_points.txt,sha256=zVJihl1jgNaayMgxgA1G9KN85LZ9T8bbss4VfqE78Bg,60
18
+ airbyte_source_commcare-0.1.0.dist-info/top_level.txt,sha256=WCua9CoU5uJ26Af2zVYpamondw5woLFReykWLPCjX7s,45
19
+ airbyte_source_commcare-0.1.0.dist-info/RECORD,,
@@ -0,0 +1,5 @@
1
+ Wheel-Version: 1.0
2
+ Generator: bdist_wheel (0.42.0)
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
5
+
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ source-commcare = source_commcare.run:run
@@ -0,0 +1,3 @@
1
+ integration_tests
2
+ source_commcare
3
+ unit_tests
@@ -0,0 +1,3 @@
1
+ #
2
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
3
+ #
@@ -0,0 +1,5 @@
1
+ {
2
+ "Assess a referred patient": {
3
+ "indexed_on": "2023-11-25T20:30:30.2423"
4
+ }
5
+ }
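The saved state above stores `indexed_on` with a fractional-seconds part, which is what the `%Y-%m-%dT%H:%M:%S.%f` format used in `source_commcare/source.py` expects. As a minimal sketch (the helper `parse_cursor` is hypothetical and not part of the package), the following shows how `strptime` treats that value and how a fallback format could accept a timestamp that has no fractional part:

```python
from datetime import datetime

# Same format string as the `dateformat` property in source.py.
DATEFORMAT = "%Y-%m-%dT%H:%M:%S.%f"
# Assumed fallback for timestamps without fractional seconds.
DATEFORMAT_NO_FRACTION = "%Y-%m-%dT%H:%M:%S"


def parse_cursor(value: str) -> datetime:
    """Hypothetical helper: parse a saved cursor string into a naive datetime."""
    try:
        return datetime.strptime(value, DATEFORMAT)
    except ValueError:
        # strptime with ".%f" rejects strings that lack the ".ffffff" part.
        return datetime.strptime(value, DATEFORMAT_NO_FRACTION)


parsed = parse_cursor("2023-11-25T20:30:30.2423")
no_fraction = parse_cursor("2023-11-25T20:30:30")
```

`%f` accepts one to six digits, so `.2423` is read as 242300 microseconds. A fallback of this kind is one possible alternative to forcing a non-zero microsecond value on the cursor.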
@@ -0,0 +1,16 @@
1
+ #
2
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
3
+ #
4
+
5
+
6
+ import pytest
7
+
8
+ pytest_plugins = ("connector_acceptance_test.plugin",)
9
+
10
+
11
+ @pytest.fixture(scope="session", autouse=True)
12
+ def connector_setup():
13
+ """This fixture is a placeholder for external resources that acceptance test might require."""
14
+ # TODO: setup test dependencies if needed. otherwise remove the TODO comments
15
+ yield
16
+ # TODO: clean up test dependencies
@@ -0,0 +1 @@
1
+ {}
@@ -0,0 +1,16 @@
1
+ {
2
+ "streams": [
3
+ {
4
+ "stream": {
5
+ "name": "Assess a referred patient",
6
+ "json_schema": {},
7
+ "supported_sync_modes": ["full_refresh", "incremental"],
8
+ "source_defined_cursor": true,
9
+ "default_cursor_field": ["indexed_on"]
10
+ },
11
+ "sync_mode": "incremental",
12
+ "cursor_field": ["indexed_on"],
13
+ "destination_sync_mode": "append"
14
+ }
15
+ ]
16
+ }
@@ -0,0 +1,6 @@
1
+ {
2
+ "app_id": "wrong app_id",
3
+ "api_key": "wrong api key",
4
+ "start_date": "This has the wrong format",
5
+ "project_space": "project_space"
6
+ }
@@ -0,0 +1,6 @@
1
+ {
2
+ "app_id": "App ID",
3
+ "api_key": "API KEY",
4
+ "project_space": "project_space",
5
+ "start_date": "Start Date"
6
+ }
@@ -0,0 +1,5 @@
1
+ {
2
+ "todo-stream-name": {
3
+ "todo-field-name": "value"
4
+ }
5
+ }
@@ -0,0 +1,8 @@
1
+ #
2
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
3
+ #
4
+
5
+
6
+ from .source import SourceCommcare
7
+
8
+ __all__ = ["SourceCommcare"]
source_commcare/run.py ADDED
@@ -0,0 +1,14 @@
1
+ #
2
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
3
+ #
4
+
5
+
6
+ import sys
7
+
8
+ from airbyte_cdk.entrypoint import launch
9
+ from source_commcare import SourceCommcare
10
+
11
+
12
+ def run():
13
+ source = SourceCommcare()
14
+ launch(source, sys.argv[1:])
@@ -0,0 +1,337 @@
1
+ #
2
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
3
+ #
4
+
5
+ import re
6
+ from abc import ABC
7
+ from datetime import datetime
8
+ from typing import Any, Iterable, List, Mapping, MutableMapping, Optional, Tuple
9
+ from urllib.parse import parse_qs
10
+
11
+ import requests
12
+ from airbyte_cdk.models import SyncMode
13
+ from airbyte_cdk.sources import AbstractSource
14
+ from airbyte_cdk.sources.streams import IncrementalMixin, Stream
15
+ from airbyte_cdk.sources.streams.http import HttpStream
16
+ from airbyte_cdk.sources.streams.http.requests_native_auth import TokenAuthenticator
17
+ from flatten_json import flatten
18
+
19
+
20
+ # Basic full refresh stream
21
+ class CommcareStream(HttpStream, ABC):
22
+ def __init__(self, project_space, **kwargs):
23
+ super().__init__(**kwargs)
24
+ self.project_space = project_space
25
+
26
+ @property
27
+ def url_base(self) -> str:
28
+ return f"https://www.commcarehq.org/a/{self.project_space}/api/v0.5/"
29
+
30
+ # These class variables save state
31
+ # forms holds form ids and we filter cases which contain one of these form ids
32
+ # last_form_date stores the date of the last form read so the next cycle for forms and cases starts at the same timestamp
33
+ forms = set()
34
+ last_form_date = None
35
+ schemas = {}
36
+ unwantedfields = re.compile(r"^(case_|update_|meta|create_|commcare_).*$")
37
+
38
+ @property
39
+ def dateformat(self):
40
+ return "%Y-%m-%dT%H:%M:%S.%f"
41
+
42
+ def scrubUnwantedFields(self, form):
43
+ newform = {k: v for k, v in form.items() if not self.unwantedfields.match(k)}
44
+ return newform
45
+
46
+ def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
47
+ try:
48
+ # Server returns status 500 when there are no more rows.
49
+ # raise an error if server returns an error
50
+ response.raise_for_status()
51
+ meta = response.json()["meta"]
52
+ return parse_qs(meta["next"][1:])
53
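+ except Exception:
54
+ # No usable "next" link (or an error response): stop paginating instead of returning the exception as a token
+ return None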
55
+
56
+ def request_params(
57
+ self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
58
+ ) -> MutableMapping[str, Any]:
59
+
60
+ params = {"format": "json"}
61
+ return params
62
+
63
+
64
+ class Application(CommcareStream):
65
+ primary_key = "id"
66
+
67
+ def __init__(self, app_id, **kwargs):
68
+ super().__init__(**kwargs)
69
+ self.app_id = app_id
70
+
71
+ def path(
72
+ self, stream_state: Mapping[str, Any] = None, stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
73
+ ) -> str:
74
+ return f"application/{self.app_id}/"
75
+
76
+ def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
77
+ return None
78
+
79
+ def request_params(
80
+ self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
81
+ ) -> MutableMapping[str, Any]:
82
+
83
+ params = {"format": "json", "extras": "true"}
84
+ return params
85
+
86
+ def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]:
87
+ yield response.json()
88
+
89
+
90
+ class IncrementalStream(CommcareStream, IncrementalMixin):
91
+ cursor_field = "indexed_on"
92
+ _cursor_value = None
93
+
94
+ @property
95
+ def state(self) -> Mapping[str, Any]:
96
+ if self._cursor_value:
97
+ return {self.cursor_field: self._cursor_value}
98
+
99
+ @state.setter
100
+ def state(self, value: Mapping[str, Any]):
101
+ self._cursor_value = datetime.strptime(value[self.cursor_field], self.dateformat)
102
+
103
+ @property
104
+ def sync_mode(self):
105
+ return SyncMode.incremental
106
+
107
+ @property
108
+ def supported_sync_modes(self):
109
+ return [SyncMode.incremental]
110
+
111
+ def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
112
+ try:
113
+ # Server returns status 500 when there are no more rows.
114
+ # raise an error if server returns an error
115
+ response.raise_for_status()
116
+ meta = response.json()["meta"]
117
+ if meta["next"]:
118
+ return parse_qs(meta["next"][1:])
119
+ return None
120
+ except Exception:
121
+ return None
122
+
123
+ def request_params(
124
+ self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
125
+ ) -> MutableMapping[str, Any]:
126
+
127
+ params = {"format": "json"}
128
+ if next_page_token:
129
+ params.update(next_page_token)
130
+ return params
131
+
132
+ def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]:
133
+ for o in iter(response.json()["objects"]):
134
+ yield o
135
+ return None
136
+
137
+
138
+ class Case(IncrementalStream):
139
+
140
+ """
141
+ docs: https://www.commcarehq.org/a/[domain]/api/[version]/case/
142
+ """
143
+
144
+ cursor_field = "indexed_on"
145
+ primary_key = "id"
146
+
147
+ def __init__(self, start_date, app_id, schema, **kwargs):
148
+ super().__init__(**kwargs)
149
+ self._cursor_value = datetime.strptime(start_date, "%Y-%m-%dT%H:%M:%SZ")
150
+ self.schema = schema
151
+
152
+ def get_json_schema(self):
153
+ return self.schema
154
+
155
+ @property
156
+ def name(self):
157
+ # Airbyte orders streams in alpha order but since we have dependent peers and we need to
158
+ # pull all forms before cases, we name this stream to
159
+ # ensure this stream gets pulled last (assuming ascii stream names only)
160
+ return "zzz_case"
161
+
162
+ def path(
163
+ self, stream_state: Mapping[str, Any] = None, stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
164
+ ) -> str:
165
+ return "case"
166
+
167
+ def request_params(
168
+ self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
169
+ ) -> MutableMapping[str, Any]:
170
+
171
+ # start date is what we saved for forms
172
+ # if self.cursor_field in self.state else (CommcareStream.last_form_date or self.initial_date)
173
+ ix = self.state[self.cursor_field]
174
+ params = {"format": "json", "indexed_on_start": ix.strftime(self.dateformat), "order_by": "indexed_on", "limit": "5000"}
175
+ if next_page_token:
176
+ params.update(next_page_token)
177
+ return params
178
+
179
+ def read_records(self, *args, **kwargs) -> Iterable[Mapping[str, Any]]:
180
+ for record in super().read_records(*args, **kwargs):
181
+ found = False
182
+ for f in record["xform_ids"]:
183
+ if f in CommcareStream.forms:
184
+ found = True
185
+ break
186
+ if found:
187
+ self._cursor_value = datetime.strptime(record[self.cursor_field], self.dateformat)
188
+ # Make indexed_on tz aware
189
+ record.update({"streamname": "case", "indexed_on": record["indexed_on"] + "Z"})
190
+ # convert xform_ids field from array to comma separated list so flattening won't create
191
+ # one field per item. This is because some cases have up to 2000 xform_ids and we don't want 2000 extra
192
+ # fields in the schema
193
+ record["xform_ids"] = ",".join(record["xform_ids"])
194
+ frec = flatten(record)
195
+ yield frec
196
+ if self._cursor_value.microsecond == 0:
197
+ # Airbyte converts the cursor_field value (datetime) to string when it saves the state and
198
+ # our state setter parses the saved state with a format that contains microseconds
199
+ # self._cursor_value must have non-zero microseconds for the formatting and parsing to work correctly.
200
+ # This issue would also occur if an incoming record had a timestamp with zero microseconds
201
+ self._cursor_value = self._cursor_value.replace(microsecond=10)
202
+ # This cycle of pull is complete so clear out the form ids we saved for this cycle
203
+ CommcareStream.forms.clear()
204
+
205
+
206
+ class Form(IncrementalStream):
207
+ """
208
+ docs: https://www.commcarehq.org/a/[domain]/api/[version]/form/
209
+ """
210
+
211
+ cursor_field = "indexed_on"
212
+ primary_key = "id"
213
+
214
+ def __init__(self, start_date, app_id, name, xmlns, schema, **kwargs):
215
+ super().__init__(**kwargs)
216
+ self.app_id = app_id
217
+ self._cursor_value = datetime.strptime(start_date, "%Y-%m-%dT%H:%M:%SZ")
218
+ self.streamname = name
219
+ self.xmlns = xmlns
220
+ self.schema = schema
221
+
222
+ @property
223
+ def name(self):
224
+ return self.streamname
225
+
226
+ def get_json_schema(self):
227
+ return self.schema
228
+
229
+ def path(
230
+ self, stream_state: Mapping[str, Any] = None, stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
231
+ ) -> str:
232
+ return "form"
233
+
234
+ def request_params(
235
+ self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
236
+ ) -> MutableMapping[str, Any]:
237
+
238
+ # if self.cursor_field in self.state else self.initial_date
239
+ ix = self.state[self.cursor_field]
240
+ params = {
241
+ "format": "json",
242
+ "app_id": self.app_id,
243
+ "indexed_on_start": ix.strftime(self.dateformat),
244
+ "order_by": "indexed_on",
245
+ "limit": "1000",
246
+ "xmlns": self.xmlns,
247
+ }
248
+ if next_page_token:
249
+ params.update(next_page_token)
250
+ return params
251
+
252
+ def read_records(self, *args, **kwargs) -> Iterable[Mapping[str, Any]]:
253
+ upd = {"streamname": self.streamname, "xmlns": self.xmlns}
254
+ for record in super().read_records(*args, **kwargs):
255
+ self._cursor_value = datetime.strptime(record[self.cursor_field], self.dateformat)
256
+ CommcareStream.forms.add(record["id"])
257
+ form = record["form"]
258
+ form.update(upd)
259
+ # Append Z to make it timezone aware
260
+ form.update({"id": record["id"], "indexed_on": record["indexed_on"] + "Z"})
261
+ newform = self.scrubUnwantedFields(form)
262
+ yield flatten(newform)
263
+ if self._cursor_value.microsecond == 0:
264
+ # Airbyte converts the cursor_field value (datetime) to string when it saves the state and
265
+ # our state setter parses the saved state with a format that contains microseconds
266
+ # self._cursor_value must have non-zero microseconds for the formatting and parsing to work correctly.
267
+ # This issue would also occur if an incoming record had a timestamp with zero microseconds
268
+ self._cursor_value = self._cursor_value.replace(microsecond=10)
269
+
270
+
271
+ # Source
272
+ class SourceCommcare(AbstractSource):
273
+ def check_connection(self, logger, config) -> Tuple[bool, Any]:
274
+ if "api_key" not in config:
275
+ return False, None
276
+ return True, None
277
+
278
+ def base_schema(self):
279
+ return {
280
+ "$schema": "http://json-schema.org/draft-07/schema#",
281
+ "type": "object",
282
+ "properties": {"id": {"type": "string"}, "indexed_on": {"type": "string", "format": "date-time"}},
283
+ }
284
+
285
+ def streams(self, config: Mapping[str, Any]) -> List[Stream]:
286
+ auth = TokenAuthenticator(config["api_key"], auth_method="ApiKey")
287
+ args = {
288
+ "authenticator": auth,
289
+ }
290
+ appdata = Application(**{**args, "app_id": config["app_id"], "project_space": config["project_space"]}).read_records(
291
+ sync_mode=SyncMode.full_refresh
292
+ )
293
+
294
+ # Generate streams for forms, one per xmlns and one stream for cases.
295
+ streams = self.generate_streams(args, config, appdata)
296
+ return streams
297
+
298
+ def generate_streams(self, args, config, appdata):
299
+ form_args = {"app_id": config["app_id"], "start_date": config["start_date"], "project_space": config["project_space"], **args}
300
+ streams = []
301
+ name2xmlns = {}
302
+
303
+ # Collect the form names and xmlns from the application
304
+ for record in appdata:
305
+ mods = record["modules"]
306
+ for m in mods:
307
+ forms = m["forms"]
308
+ for f in forms:
309
+ xmlns = f["xmlns"]
310
+ formname = ""
311
+ if "en" in f["name"]:
312
+ formname = f["name"]["en"].strip()
313
+ else:
314
+ # Unknown forms are named Unnamed_xxxxx where xxxxx are the last 5 characters of the XMLNS
315
+ # This convention gives us repeatable names
316
+ formname = f"Unnamed_{xmlns[-5:]}"
317
+
318
+ name = formname
319
+ name2xmlns[name] = xmlns
320
+
321
+ # Create the streams from the collected names
322
+ # Sorted by name
323
+ for k in sorted(name2xmlns):
324
+ key = name2xmlns[k]
325
+ stream = Form(name=k, xmlns=key, schema=self.base_schema(), **form_args)
326
+ streams.append(stream)
327
+
328
+ stream = Case(
329
+ app_id=config["app_id"],
330
+ start_date=config["start_date"],
331
+ schema=self.base_schema(),
332
+ project_space=config["project_space"],
333
+ **args,
334
+ )
335
+ streams.append(stream)
336
+
337
+ return streams
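The incremental streams in `source.py` page through results by taking the `next` value from the response's `meta` object, dropping its leading `?`, and parsing the rest with `parse_qs`. A minimal, self-contained sketch of that step follows. The function name `next_page_params` and the sample `meta` payload are assumptions for illustration; only the `meta["next"]` key and the `[1:]` slice come from the code above.

```python
from typing import Any, List, Mapping, Optional
from urllib.parse import parse_qs


def next_page_params(meta: Mapping[str, Any]) -> Optional[Mapping[str, List[str]]]:
    """Return query parameters for the next page, or None when there is no next page."""
    next_link = meta.get("next")
    if not next_link:
        return None
    # Drop the leading "?" before parsing, mirroring meta["next"][1:] in source.py.
    return parse_qs(next_link[1:])


# Assumed response shape, for illustration only.
meta = {"limit": 2, "offset": 0, "next": "?limit=2&offset=2"}
params = next_page_params(meta)
last = next_page_params({"next": None})
```

Note that `parse_qs` maps each key to a list of values, so the parameters merged into the next request are lists such as `["2"]` rather than plain strings.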
@@ -0,0 +1,38 @@
1
+ documentationUrl: https://docsurl.com
2
+ connectionSpecification:
3
+ $schema: http://json-schema.org/draft-07/schema#
4
+ title: Commcare Source Spec
5
+ type: object
6
+ required:
7
+ - api_key
8
+ - app_id
9
+ - start_date
10
+ properties:
11
+ api_key:
12
+ type: string
13
+ title: API Key
14
+ description: >-
15
+ Commcare API Key
16
+ airbyte_secret: true
17
+ order: 0
18
+ project_space:
19
+ type: string
20
+ title: Project Space
21
+ description: >-
22
+ Project Space for commcare
23
+ order: 1
24
+ app_id:
25
+ type: string
26
+ title: Application ID
27
+ description: >-
28
+ The Application ID we are interested in
29
+ airbyte_secret: true
30
+ order: 2
31
+ start_date:
32
+ type: string
33
+ title: Start date for extracting records
34
+ pattern: ^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$
35
+ default: "2022-10-01T00:00:00Z"
36
+ description: >-
37
+ UTC date and time in the format 2017-01-25T00:00:00Z. Only records after this date will be replicated.
38
+ order: 3
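The `start_date` pattern in the spec and the `"%Y-%m-%dT%H:%M:%SZ"` format that `source.py` passes to `strptime` describe the same shape. A minimal sketch of checking a value against both is shown here; the helper `parse_start_date` is hypothetical and not part of the connector.

```python
import re
from datetime import datetime

# Pattern copied from the start_date property in spec.yaml.
START_DATE_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$")
# Format string used by the Case and Form constructors in source.py.
START_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_start_date(value: str) -> datetime:
    """Hypothetical helper: validate a start_date string and parse it into a naive datetime."""
    if not START_DATE_PATTERN.match(value):
        raise ValueError(f"start_date does not match the spec pattern: {value!r}")
    return datetime.strptime(value, START_DATE_FORMAT)


default_start = parse_start_date("2022-10-01T00:00:00Z")

try:
    parse_start_date("This has the wrong format")
    rejected = False
except ValueError:
    rejected = True
```

The trailing `Z` is matched as a literal character by this format, so the resulting `datetime` carries no timezone information.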
unit_tests/__init__.py ADDED
@@ -0,0 +1,3 @@
1
+ #
2
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
3
+ #
@@ -0,0 +1,25 @@
1
+ #
2
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
3
+ #
4
+
5
+ from unittest.mock import MagicMock, Mock
6
+
7
+ import pytest
8
+ from source_commcare.source import SourceCommcare
9
+
10
+
11
+ @pytest.fixture(name="config")
12
+ def config_fixture():
13
+ return {"api_key": "apikey", "app_id": "appid", "start_date": "2022-01-01T00:00:00Z"}
14
+
15
+
16
+ def test_check_connection_ok(mocker, config):
17
+ source = SourceCommcare()
18
+ logger_mock = Mock()
19
+ assert source.check_connection(logger_mock, config=config) == (True, None)
20
+
21
+
22
+ def test_check_connection_fail(mocker, config):
23
+ source = SourceCommcare()
24
+ logger_mock = MagicMock()
25
+ assert source.check_connection(logger_mock, config={}) == (False, None)
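The tests above exercise only `check_connection`. As a sketch of how another piece of `source.py` could be checked without network access, the snippet below reproduces the field-scrubbing regex in a standalone function. The name `scrub_unwanted_fields` and the sample record are assumptions for illustration; the regex is copied from `CommcareStream.unwantedfields`.

```python
import re

# Regex copied from CommcareStream.unwantedfields in source.py.
UNWANTED_FIELDS = re.compile(r"^(case_|update_|meta|create_|commcare_).*$")


def scrub_unwanted_fields(form: dict) -> dict:
    """Standalone equivalent of scrubUnwantedFields: drop keys matching the regex."""
    return {k: v for k, v in form.items() if not UNWANTED_FIELDS.match(k)}


# Hypothetical form record, for illustration only.
sample = {
    "id": "abc",
    "meta": {"userID": "u1"},
    "case_name": "example",
    "patient_age": 42,
    "metadata_note": "also dropped",
}
cleaned = scrub_unwanted_fields(sample)
```

Because the `meta` alternative has no trailing underscore, any key that merely starts with `meta` (such as `metadata_note` here) is removed as well.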