airbyte-source-commcare 0.1.0__py3-none-any.whl

@@ -0,0 +1,102 @@
1
+ Metadata-Version: 2.1
2
+ Name: airbyte-source-commcare
3
+ Version: 0.1.0
4
+ Summary: Source implementation for Commcare.
5
+ Author: Airbyte
6
+ Author-email: contact@airbyte.io
7
+ Description-Content-Type: text/markdown
8
+ Requires-Dist: airbyte-cdk
9
+ Requires-Dist: bigquery-schema-generator ~=1.5
10
+ Requires-Dist: gbqschema-converter ~=1.2.0
11
+ Requires-Dist: flatten-json ~=0.1.13
12
+ Provides-Extra: tests
13
+ Requires-Dist: requests-mock ~=1.9.3 ; extra == 'tests'
14
+ Requires-Dist: pytest ~=6.1 ; extra == 'tests'
15
+ Requires-Dist: pytest-mock ~=3.6.1 ; extra == 'tests'
16
+
17
+ # Commcare Source
18
+
19
+ This is the repository for the Commcare source connector, written in Python.
20
+ For information about how to use this connector within Airbyte, see [the documentation](https://docs.airbyte.io/integrations/sources/commcare).
21
+
22
+
23
+ **To iterate on this connector, make sure to complete this prerequisites section.**
24
+
25
+
26
+ From this connector directory, create a virtual environment:
27
+ ```
28
+ python -m venv .venv
29
+ ```
30
+
31
+ This will generate a virtualenv for this module in `.venv/`. Make sure this venv is active in your
32
+ development environment of choice. To activate it from the terminal, run:
33
+ ```
34
+ source .venv/bin/activate
35
+ pip install -r requirements.txt
36
+ pip install '.[tests]'
37
+ ```
38
+ If you are in an IDE, follow your IDE's instructions to activate the virtualenv.
39
+
40
+ Note that while we are installing dependencies from `requirements.txt`, you should only edit `setup.py` for your dependencies. `requirements.txt` is
41
+ used for editable installs (`pip install -e`) to pull in Python dependencies from the monorepo and will call `setup.py`.
42
+ If this is mumbo jumbo to you, don't worry about it, just put your deps in `setup.py` but install using `pip install -r requirements.txt` and everything
43
+ should work as you expect.
44
+
45
+ **If you are a community contributor**, follow the instructions in the [documentation](https://docs.airbyte.io/integrations/sources/commcare)
46
+ to generate the necessary credentials. Then create a file `secrets/config.json` conforming to the `source_commcare/spec.yaml` file.
47
+ Note that any directory named `secrets` is gitignored across the entire Airbyte repo, so there is no danger of accidentally checking in sensitive information.
48
+ See `integration_tests/sample_config.json` for a sample config file.
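As a rough illustration of what "conforming to `spec.yaml`" means in practice, the sketch below checks a config dict for the keys the spec lists as required, plus `project_space`, which `source.py` reads when it builds streams. The helper name `missing_config_keys` is hypothetical and not part of the connector, and the values are placeholders.

```python
import json

# Keys listed under `required` in source_commcare/spec.yaml.
REQUIRED_KEYS = ("api_key", "app_id", "start_date")
# Not marked required in the spec, but source.py reads config["project_space"].
ALSO_NEEDED = ("project_space",)


def missing_config_keys(config: dict) -> list:
    """Return the expected keys that are absent from the config, in a stable order."""
    return [key for key in REQUIRED_KEYS + ALSO_NEEDED if key not in config]


# Placeholder values only; real credentials belong in secrets/config.json.
example = json.loads('{"api_key": "API KEY", "app_id": "App ID", "start_date": "2022-10-01T00:00:00Z"}')
print(missing_config_keys(example))
```

An empty list means every expected key is present; it says nothing about whether the credentials are valid.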
49
+
50
+ **If you are an Airbyte core member**, copy the credentials in Lastpass under the secret name `source commcare test creds`
51
+ and place them into `secrets/config.json`.
52
+
53
+ ```
54
+ python main.py spec
55
+ python main.py check --config secrets/config.json
56
+ python main.py discover --config secrets/config.json
57
+ python main.py read --config secrets/config.json --catalog integration_tests/configured_catalog.json
58
+ ```
59
+
60
+
61
+
62
+ **Via [`airbyte-ci`](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md) (recommended):**
63
+ ```bash
64
+ airbyte-ci connectors --name=source-commcare build
65
+ ```
66
+
67
+ An image will be built with the tag `airbyte/source-commcare:dev`.
68
+
69
+ **Via `docker build`:**
70
+ ```bash
71
+ docker build -t airbyte/source-commcare:dev .
72
+ ```
73
+
74
+ Then run any of the connector commands as follows:
75
+ ```
76
+ docker run --rm airbyte/source-commcare:dev spec
77
+ docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-commcare:dev check --config /secrets/config.json
78
+ docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-commcare:dev discover --config /secrets/config.json
79
+ docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/integration_tests:/integration_tests airbyte/source-commcare:dev read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json
80
+ ```
81
+
82
+ You can run our full test suite locally using [`airbyte-ci`](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md):
83
+ ```bash
84
+ airbyte-ci connectors --name=source-commcare test
85
+ ```
86
+
87
+ Customize the `acceptance-test-config.yml` file to configure tests. See [Connector Acceptance Tests](https://docs.airbyte.com/connector-development/testing-connectors/connector-acceptance-tests-reference) for more information.
88
+ If your connector needs to create or destroy resources for use during acceptance tests, create fixtures for them and place them inside `integration_tests/acceptance.py`.
89
+
90
+ All of your dependencies should go in `setup.py`, NOT `requirements.txt`. The requirements file is only used to connect internal Airbyte dependencies in the monorepo for local development.
91
+ We split dependencies between two groups, dependencies that are:
92
+ * required for your connector to work go in the `MAIN_REQUIREMENTS` list.
93
+ * required for testing go in the `TEST_REQUIREMENTS` list.
94
+
95
+ You've checked out the repo, implemented a million dollar feature, and you're ready to share your changes with the world. Now what?
96
+ 1. Make sure your changes are passing our test suite: `airbyte-ci connectors --name=source-commcare test`
97
+ 2. Bump the connector version in `metadata.yaml`: increment the `dockerImageTag` value. Please follow [semantic versioning for connectors](https://docs.airbyte.com/contributing-to-airbyte/resources/pull-requests-handbook/#semantic-versioning-for-connectors).
98
+ 3. Make sure the `metadata.yaml` content is up to date.
99
+ 4. Make sure the connector documentation and its changelog are up to date (`docs/integrations/sources/commcare.md`).
100
+ 5. Create a Pull Request: use [our PR naming conventions](https://docs.airbyte.com/contributing-to-airbyte/resources/pull-requests-handbook/#pull-request-title-convention).
101
+ 6. Pat yourself on the back for being an awesome contributor.
102
+ 7. Someone from Airbyte will take a look at your PR and iterate with you to merge it into master.
@@ -0,0 +1,19 @@
1
+ integration_tests/__init__.py,sha256=4Hw-PX1-VgESLF16cDdvuYCzGJtHntThLF4qIiULWeo,61
2
+ integration_tests/abnormal_state.json,sha256=-DJMcBhq1do1ntOZXzsaGmxvbDTmYmFS0cSSaUoRLZ4,86
3
+ integration_tests/acceptance.py,sha256=8eU9iSDbmHyufPvAouJGhPMgPAFTCP8IKIKHLm7u5TE,435
4
+ integration_tests/catalog.json,sha256=yj0WO6sFU4GCciYUBWjzvvfqrBh869doeOC2Pp5EI1Y,3
5
+ integration_tests/configured_catalog.json,sha256=1tbuuekIkjKW6NKdwW9ulYAYNjErYgAJqDsq08etQLE,397
6
+ integration_tests/invalid_config.json,sha256=kXy1C-2u6aW122jeDBlv1taNZDYjzdjRyer_2B1xgk8,142
7
+ integration_tests/sample_config.json,sha256=Lh9_zss5lxbqGFYpZnvi8qo5eIMNcuQ6GBDKbw1dVhM,115
8
+ integration_tests/sample_state.json,sha256=HBR6G9QPoPDFAlfAKwRmeNaQ5_56xehMLtwWr4mxd5k,63
9
+ source_commcare/__init__.py,sha256=e2gcZC5sfkJ_uidHqoPCVmqUdL9mhpMrw4Ix4uIzSng,128
10
+ source_commcare/run.py,sha256=aZZ10q2rM_fY8lmNk4Gmb_pOhDJvv-biHN6V-ut5Y6I,236
11
+ source_commcare/source.py,sha256=X_4bNim1LBDCVjqnhWViaFgsjPvMTXkZZ1yNRGRf1Aw,12956
12
+ source_commcare/spec.yaml,sha256=CMpevuCDKMFZmq16CdeilpvXKbCiQPjmI1fcgzvJQuQ,1019
13
+ unit_tests/__init__.py,sha256=4Hw-PX1-VgESLF16cDdvuYCzGJtHntThLF4qIiULWeo,61
14
+ unit_tests/test_source.py,sha256=qneLJGlvcpoyFlnvwzwP2voDtzi8z6c_N-lu0won5wM,682
15
+ airbyte_source_commcare-0.1.0.dist-info/METADATA,sha256=-GDqr4YKVK8LGyoTB9nmPitvdARyMlTt9LOzmE5dJUc,5536
16
+ airbyte_source_commcare-0.1.0.dist-info/WHEEL,sha256=oiQVh_5PnQM0E3gPdiz09WCNmwiHDMaGer_elqB3coM,92
17
+ airbyte_source_commcare-0.1.0.dist-info/entry_points.txt,sha256=zVJihl1jgNaayMgxgA1G9KN85LZ9T8bbss4VfqE78Bg,60
18
+ airbyte_source_commcare-0.1.0.dist-info/top_level.txt,sha256=WCua9CoU5uJ26Af2zVYpamondw5woLFReykWLPCjX7s,45
19
+ airbyte_source_commcare-0.1.0.dist-info/RECORD,,
@@ -0,0 +1,5 @@
1
+ Wheel-Version: 1.0
2
+ Generator: bdist_wheel (0.42.0)
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
5
+
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ source-commcare = source_commcare.run:run
@@ -0,0 +1,3 @@
1
+ integration_tests
2
+ source_commcare
3
+ unit_tests
@@ -0,0 +1,3 @@
1
+ #
2
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
3
+ #
@@ -0,0 +1,5 @@
+ {
+   "Assess a referred patient": {
+     "indexed_on": "2023-11-25T20:30:30.2423"
+   }
+ }
@@ -0,0 +1,16 @@
+ #
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
+ #
+
+
+ import pytest
+
+ pytest_plugins = ("connector_acceptance_test.plugin",)
+
+
+ @pytest.fixture(scope="session", autouse=True)
+ def connector_setup():
+     """This fixture is a placeholder for external resources that acceptance test might require."""
+     # TODO: setup test dependencies if needed. otherwise remove the TODO comments
+     yield
+     # TODO: clean up test dependencies
@@ -0,0 +1 @@
1
+ {}
@@ -0,0 +1,16 @@
+ {
+   "streams": [
+     {
+       "stream": {
+         "name": "Assess a referred patient",
+         "json_schema": {},
+         "supported_sync_modes": ["full_refresh", "incremental"],
+         "source_defined_cursor": true,
+         "default_cursor_field": ["indexed_on"]
+       },
+       "sync_mode": "incremental",
+       "cursor_field": ["indexed_on"],
+       "destination_sync_mode": "append"
+     }
+   ]
+ }
@@ -0,0 +1,6 @@
+ {
+   "app_id": "wrong app_id",
+   "api_key": "wrong api key",
+   "start_date": "This has the wrong format",
+   "project_space": "project_space"
+ }
@@ -0,0 +1,6 @@
+ {
+   "app_id": "App ID",
+   "api_key": "API KEY",
+   "project_space": "project_space",
+   "start_date": "Start Date"
+ }
@@ -0,0 +1,5 @@
+ {
+   "todo-stream-name": {
+     "todo-field-name": "value"
+   }
+ }
@@ -0,0 +1,8 @@
1
+ #
2
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
3
+ #
4
+
5
+
6
+ from .source import SourceCommcare
7
+
8
+ __all__ = ["SourceCommcare"]
source_commcare/run.py ADDED
@@ -0,0 +1,14 @@
+ #
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
+ #
+
+
+ import sys
+
+ from airbyte_cdk.entrypoint import launch
+ from source_commcare import SourceCommcare
+
+
+ def run():
+     source = SourceCommcare()
+     launch(source, sys.argv[1:])
@@ -0,0 +1,337 @@
+ #
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
+ #
+
+ import re
+ from abc import ABC
+ from datetime import datetime
+ from typing import Any, Iterable, List, Mapping, MutableMapping, Optional, Tuple
+ from urllib.parse import parse_qs
+
+ import requests
+ from airbyte_cdk.models import SyncMode
+ from airbyte_cdk.sources import AbstractSource
+ from airbyte_cdk.sources.streams import IncrementalMixin, Stream
+ from airbyte_cdk.sources.streams.http import HttpStream
+ from airbyte_cdk.sources.streams.http.requests_native_auth import TokenAuthenticator
+ from flatten_json import flatten
+
+
+ # Basic full refresh stream
+ class CommcareStream(HttpStream, ABC):
+     def __init__(self, project_space, **kwargs):
+         super().__init__(**kwargs)
+         self.project_space = project_space
+
+     @property
+     def url_base(self) -> str:
+         return f"https://www.commcarehq.org/a/{self.project_space}/api/v0.5/"
+
+     # These class variables save state
+     # forms holds form ids and we filter cases which contain one of these form ids
+     # last_form_date stores the date of the last form read so the next cycle for forms and cases starts at the same timestamp
+     forms = set()
+     last_form_date = None
+     schemas = {}
+     unwantedfields = re.compile(r"^(case_|update_|meta|create_|commcare_).*$")
+
+     @property
+     def dateformat(self):
+         return "%Y-%m-%dT%H:%M:%S.%f"
+
+     def scrubUnwantedFields(self, form):
+         newform = {k: v for k, v in form.items() if not self.unwantedfields.match(k)}
+         return newform
+
+     def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
+         try:
+             # Server returns status 500 when there are no more rows.
+             # raise an error if server returns an error
+             response.raise_for_status()
+             meta = response.json()["meta"]
+             if meta["next"]:
+                 return parse_qs(meta["next"][1:])
+             return None
+         except Exception:
+             # Return None rather than the exception object: any non-None value is treated as a page token
+             return None
+
+     def request_params(
+         self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
+     ) -> MutableMapping[str, Any]:
+
+         params = {"format": "json"}
+         return params
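To make the effect of `unwantedfields` concrete, here is a minimal standalone sketch that applies the same regular expression to a made-up form payload. The function name `scrub_unwanted_fields` and the sample keys are illustrative only.

```python
import re

# Same pattern as CommcareStream.unwantedfields in source.py.
UNWANTED = re.compile(r"^(case_|update_|meta|create_|commcare_).*$")


def scrub_unwanted_fields(form: dict) -> dict:
    """Drop top-level keys whose names start with one of the unwanted prefixes."""
    return {k: v for k, v in form.items() if not UNWANTED.match(k)}


# Made-up form payload for illustration.
form = {
    "case_id": "abc",
    "meta": {"userID": "u1"},
    "metadata_note": "x",
    "patient_name": "Ada",
    "id": "f-1",
}
print(scrub_unwanted_fields(form))
```

Because the `meta` alternative has no trailing underscore, any key that merely begins with `meta` (such as `metadata_note` here) is dropped as well. Only top-level keys are inspected.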
+
+
+ class Application(CommcareStream):
+     primary_key = "id"
+
+     def __init__(self, app_id, **kwargs):
+         super().__init__(**kwargs)
+         self.app_id = app_id
+
+     def path(
+         self, stream_state: Mapping[str, Any] = None, stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
+     ) -> str:
+         return f"application/{self.app_id}/"
+
+     def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
+         return None
+
+     def request_params(
+         self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
+     ) -> MutableMapping[str, Any]:
+
+         params = {"format": "json", "extras": "true"}
+         return params
+
+     def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]:
+         yield response.json()
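The request that the `Application` stream ends up making can be approximated by combining `url_base`, `path()` and `request_params()` by hand. This is only a sketch: the real joining and encoding is done by the Airbyte CDK and `requests`, and `application_url`, `demo-space` and `abc123` are made-up names for illustration.

```python
from urllib.parse import urlencode


def application_url(project_space: str, app_id: str) -> str:
    """Approximate the URL built from url_base + path() + request_params()."""
    base = f"https://www.commcarehq.org/a/{project_space}/api/v0.5/"
    path = f"application/{app_id}/"
    query = urlencode({"format": "json", "extras": "true"})
    return f"{base}{path}?{query}"


print(application_url("demo-space", "abc123"))
```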
+
+
+ class IncrementalStream(CommcareStream, IncrementalMixin):
+     cursor_field = "indexed_on"
+     _cursor_value = None
+
+     @property
+     def state(self) -> Mapping[str, Any]:
+         if self._cursor_value:
+             return {self.cursor_field: self._cursor_value}
+
+     @state.setter
+     def state(self, value: Mapping[str, Any]):
+         self._cursor_value = datetime.strptime(value[self.cursor_field], self.dateformat)
+
+     @property
+     def sync_mode(self):
+         return SyncMode.incremental
+
+     @property
+     def supported_sync_modes(self):
+         return [SyncMode.incremental]
+
+     def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
+         try:
+             # Server returns status 500 when there are no more rows.
+             # raise an error if server returns an error
+             response.raise_for_status()
+             meta = response.json()["meta"]
+             if meta["next"]:
+                 return parse_qs(meta["next"][1:])
+             return None
+         except Exception:
+             return None
+
+     def request_params(
+         self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
+     ) -> MutableMapping[str, Any]:
+
+         params = {"format": "json"}
+         if next_page_token:
+             params.update(next_page_token)
+         return params
+
+     def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]:
+         for o in iter(response.json()["objects"]):
+             yield o
+         return None
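The pagination step relies on `meta["next"]` being a query string with a leading `?`, which is stripped before `parse_qs` turns it into request parameters. A small standalone sketch of that step, using a made-up `meta` payload and a hypothetical helper name `next_page_params`:

```python
from urllib.parse import parse_qs


def next_page_params(meta: dict):
    """Turn the API's meta["next"] query string into a params dict, or None on the last page."""
    nxt = meta.get("next")
    if not nxt:
        return None
    # Drop the leading "?" before parsing, as source.py does with [1:].
    return parse_qs(nxt[1:])


# Made-up example of the "meta" block in a list response.
meta = {"next": "?limit=1000&offset=1000&format=json"}
print(next_page_params(meta))
print(next_page_params({"next": None}))
```

Note that `parse_qs` maps each key to a list of values, so the parameters merged into the next request are lists rather than plain strings.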
+
+
+ class Case(IncrementalStream):
+
+     """
+     docs: https://www.commcarehq.org/a/[domain]/api/[version]/case/
+     """
+
+     cursor_field = "indexed_on"
+     primary_key = "id"
+
+     def __init__(self, start_date, app_id, schema, **kwargs):
+         super().__init__(**kwargs)
+         self._cursor_value = datetime.strptime(start_date, "%Y-%m-%dT%H:%M:%SZ")
+         self.schema = schema
+
+     def get_json_schema(self):
+         return self.schema
+
+     @property
+     def name(self):
+         # Airbyte orders streams in alpha order but since we have dependent peers and we need to
+         # pull all forms before cases, we name this stream to
+         # ensure this stream gets pulled last (assuming ascii stream names only)
+         return "zzz_case"
+
+     def path(
+         self, stream_state: Mapping[str, Any] = None, stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
+     ) -> str:
+         return "case"
+
+     def request_params(
+         self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
+     ) -> MutableMapping[str, Any]:
+
+         # start date is what we saved for forms
+         # if self.cursor_field in self.state else (CommcareStream.last_form_date or self.initial_date)
+         ix = self.state[self.cursor_field]
+         params = {"format": "json", "indexed_on_start": ix.strftime(self.dateformat), "order_by": "indexed_on", "limit": "5000"}
+         if next_page_token:
+             params.update(next_page_token)
+         return params
+
+     def read_records(self, *args, **kwargs) -> Iterable[Mapping[str, Any]]:
+         for record in super().read_records(*args, **kwargs):
+             found = False
+             for f in record["xform_ids"]:
+                 if f in CommcareStream.forms:
+                     found = True
+                     break
+             if found:
+                 self._cursor_value = datetime.strptime(record[self.cursor_field], self.dateformat)
+                 # Make indexed_on tz aware
+                 record.update({"streamname": "case", "indexed_on": record["indexed_on"] + "Z"})
+                 # convert xform_ids field from array to comma separated list so flattening won't create
+                 # one field per item. This is because some cases have up to 2000 xform_ids and we don't want 2000 extra
+                 # fields in the schema
+                 record["xform_ids"] = ",".join(record["xform_ids"])
+                 frec = flatten(record)
+                 yield frec
+         if self._cursor_value.microsecond == 0:
+             # Airbyte converts the cursor_field value (datetime) to string when it saves the state and
+             # our state setter parses the saved state with a format that contains microseconds
+             # self._cursor_value must have non-zero microseconds for the formatting and parsing to work correctly.
+             # This issue would also occur if an incoming record had a timestamp with zero microseconds
+             self._cursor_value = self._cursor_value.replace(microsecond=10)
+         # This cycle of pull is complete so clear out the form ids we saved for this cycle
+         CommcareStream.forms.clear()
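The case filtering in `read_records` can be illustrated without the CDK: a case is kept only if at least one of its `xform_ids` was seen while reading forms, and the id list is collapsed into one comma-separated string before flattening. The sketch below uses plain dicts and a hypothetical `filter_cases` helper; it leaves out the cursor handling and the `flatten` call.

```python
from typing import Iterable, Iterator, Set


def filter_cases(cases: Iterable[dict], seen_form_ids: Set[str]) -> Iterator[dict]:
    """Yield only cases touched by a previously read form, with xform_ids joined into one string."""
    for case in cases:
        if any(form_id in seen_form_ids for form_id in case["xform_ids"]):
            kept = dict(case)  # copy so the input record is not mutated
            kept["xform_ids"] = ",".join(case["xform_ids"])
            yield kept


# Made-up records for illustration.
cases = [
    {"id": "c1", "xform_ids": ["f1", "f9"]},
    {"id": "c2", "xform_ids": ["f7"]},
]
seen = {"f1"}
print(list(filter_cases(cases, seen)))
```

Since the set of seen form ids is filled by the form streams, this filter only behaves as intended when all form streams have been read before the case stream, which is what the `zzz_case` name is meant to arrange.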
+
+
+ class Form(IncrementalStream):
+     """
+     docs: https://www.commcarehq.org/a/[domain]/api/[version]/form/
+     """
+
+     cursor_field = "indexed_on"
+     primary_key = "id"
+
+     def __init__(self, start_date, app_id, name, xmlns, schema, **kwargs):
+         super().__init__(**kwargs)
+         self.app_id = app_id
+         self._cursor_value = datetime.strptime(start_date, "%Y-%m-%dT%H:%M:%SZ")
+         self.streamname = name
+         self.xmlns = xmlns
+         self.schema = schema
+
+     @property
+     def name(self):
+         return self.streamname
+
+     def get_json_schema(self):
+         return self.schema
+
+     def path(
+         self, stream_state: Mapping[str, Any] = None, stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
+     ) -> str:
+         return "form"
+
+     def request_params(
+         self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
+     ) -> MutableMapping[str, Any]:
+
+         # if self.cursor_field in self.state else self.initial_date
+         ix = self.state[self.cursor_field]
+         params = {
+             "format": "json",
+             "app_id": self.app_id,
+             "indexed_on_start": ix.strftime(self.dateformat),
+             "order_by": "indexed_on",
+             "limit": "1000",
+             "xmlns": self.xmlns,
+         }
+         if next_page_token:
+             params.update(next_page_token)
+         return params
+
+     def read_records(self, *args, **kwargs) -> Iterable[Mapping[str, Any]]:
+         upd = {"streamname": self.streamname, "xmlns": self.xmlns}
+         for record in super().read_records(*args, **kwargs):
+             self._cursor_value = datetime.strptime(record[self.cursor_field], self.dateformat)
+             CommcareStream.forms.add(record["id"])
+             form = record["form"]
+             form.update(upd)
+             # Append Z to make it timezone aware
+             form.update({"id": record["id"], "indexed_on": record["indexed_on"] + "Z"})
+             newform = self.scrubUnwantedFields(form)
+             yield flatten(newform)
+         if self._cursor_value.microsecond == 0:
+             # Airbyte converts the cursor_field value (datetime) to string when it saves the state and
+             # our state setter parses the saved state with a format that contains microseconds
+             # self._cursor_value must have non-zero microseconds for the formatting and parsing to work correctly.
+             # This issue would also occur if an incoming record had a timestamp with zero microseconds
+             self._cursor_value = self._cursor_value.replace(microsecond=10)
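The microsecond workaround described in the comments can be reproduced with the standard library alone. The sketch assumes the saved state is an ISO-style string such as the one `datetime.isoformat()` produces, which omits the fractional part when microseconds are zero; how Airbyte actually serializes the state is not shown in this package, so treat that as an assumption. `roundtrips` is a hypothetical helper.

```python
from datetime import datetime

# Same format string as CommcareStream.dateformat.
DATEFORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def roundtrips(value: datetime) -> bool:
    """Check whether an ISO-style string of `value` can be parsed back with DATEFORMAT."""
    text = value.isoformat()  # assumption: stand-in for how the state gets stringified
    try:
        return datetime.strptime(text, DATEFORMAT) == value
    except ValueError:
        return False


whole_second = datetime(2023, 11, 25, 20, 30, 30)
nudged = whole_second.replace(microsecond=10)

print(whole_second.isoformat(), roundtrips(whole_second))
print(nudged.isoformat(), roundtrips(nudged))
```

A whole-second timestamp has no `.%f` part to parse, so the round trip fails, while the nudged value survives at the cost of shifting the cursor by 10 microseconds.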
+
+
+ # Source
+ class SourceCommcare(AbstractSource):
+     def check_connection(self, logger, config) -> Tuple[bool, Any]:
+         if "api_key" not in config:
+             return False, None
+         return True, None
+
+     def base_schema(self):
+         return {
+             "$schema": "http://json-schema.org/draft-07/schema#",
+             "type": "object",
+             "properties": {"id": {"type": "string"}, "indexed_on": {"type": "string", "format": "date-time"}},
+         }
+
+     def streams(self, config: Mapping[str, Any]) -> List[Stream]:
+         auth = TokenAuthenticator(config["api_key"], auth_method="ApiKey")
+         args = {
+             "authenticator": auth,
+         }
+         appdata = Application(**{**args, "app_id": config["app_id"], "project_space": config["project_space"]}).read_records(
+             sync_mode=SyncMode.full_refresh
+         )
+
+         # Generate streams for forms, one per xmlns and one stream for cases.
+         streams = self.generate_streams(args, config, appdata)
+         return streams
+
+     def generate_streams(self, args, config, appdata):
+         form_args = {"app_id": config["app_id"], "start_date": config["start_date"], "project_space": config["project_space"], **args}
+         streams = []
+         name2xmlns = {}
+
+         # Collect the form names and xmlns from the application
+         for record in appdata:
+             mods = record["modules"]
+             for m in mods:
+                 forms = m["forms"]
+                 for f in forms:
+                     xmlns = f["xmlns"]
+                     formname = ""
+                     if "en" in f["name"]:
+                         formname = f["name"]["en"].strip()
+                     else:
+                         # Unknown forms are named Unnamed_xxxxx where xxxxx are the last 5 characters of the XMLNS
+                         # This convention gives us repeatable names
+                         formname = f"Unnamed_{xmlns[-5:]}"
+
+                     name = formname
+                     name2xmlns[name] = xmlns
+
+         # Create the streams from the collected names
+         # Sorted by name
+         for k in sorted(name2xmlns):
+             key = name2xmlns[k]
+             stream = Form(name=k, xmlns=key, schema=self.base_schema(), **form_args)
+             streams.append(stream)
+
+         stream = Case(
+             app_id=config["app_id"],
+             start_date=config["start_date"],
+             schema=self.base_schema(),
+             project_space=config["project_space"],
+             **args,
+         )
+         streams.append(stream)
+
+         return streams
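The stream-naming rule in `generate_streams` can be tried out on a hand-written application payload. This sketch mirrors only the name collection; `collect_form_names` and the sample `xmlns` values are made up for illustration, and no `Form` or `Case` objects are created.

```python
def collect_form_names(app: dict) -> dict:
    """Map each form's stream name to its xmlns, following the naming rule in generate_streams."""
    name_to_xmlns = {}
    for module in app["modules"]:
        for form in module["forms"]:
            xmlns = form["xmlns"]
            if "en" in form["name"]:
                name = form["name"]["en"].strip()
            else:
                # No English name: fall back to a repeatable name built from the xmlns tail.
                name = f"Unnamed_{xmlns[-5:]}"
            name_to_xmlns[name] = xmlns
    return name_to_xmlns


# Made-up application record for illustration.
app = {
    "modules": [
        {
            "forms": [
                {"name": {"en": " Assess a referred patient "}, "xmlns": "http://example.org/forms/AAAA11111"},
                {"name": {"fr": "Suivi"}, "xmlns": "http://example.org/forms/BBBB22222"},
            ]
        }
    ]
}
names = collect_form_names(app)
print(sorted(names))
```

Because names are dictionary keys, two forms that share the same English name end up as a single stream, with the later `xmlns` replacing the earlier one.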
@@ -0,0 +1,38 @@
+ documentationUrl: https://docsurl.com
+ connectionSpecification:
+   $schema: http://json-schema.org/draft-07/schema#
+   title: Commcare Source Spec
+   type: object
+   required:
+     - api_key
+     - app_id
+     - start_date
+   properties:
+     api_key:
+       type: string
+       title: API Key
+       description: >-
+         Commcare API Key
+       airbyte_secret: true
+       order: 0
+     project_space:
+       type: string
+       title: Project Space
+       description: >-
+         Project Space for commcare
+       order: 1
+     app_id:
+       type: string
+       title: Application ID
+       description: >-
+         The Application ID we are interested in
+       airbyte_secret: true
+       order: 2
+     start_date:
+       type: string
+       title: Start date for extracting records
+       pattern: ^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$
+       default: "2022-10-01T00:00:00Z"
+       description: >-
+         UTC date and time in the format 2017-01-25T00:00:00Z. Only records after this date will be replicated.
+       order: 3
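The `start_date` pattern above lines up with the `"%Y-%m-%dT%H:%M:%SZ"` format that `source.py` uses to parse the value. A minimal sketch of checking a value against both, with a hypothetical helper name `parse_start_date`:

```python
import re
from datetime import datetime

# Pattern copied from the start_date property in spec.yaml.
START_DATE_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$")


def parse_start_date(value: str) -> datetime:
    """Validate start_date against the spec pattern, then parse it the way source.py does."""
    if not START_DATE_PATTERN.match(value):
        raise ValueError(f"start_date does not match the spec pattern: {value!r}")
    # strptime also rejects values that fit the pattern but are not real dates.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


print(parse_start_date("2022-10-01T00:00:00Z"))
```

The result is a naive `datetime`; the trailing `Z` is matched literally rather than interpreted as a timezone.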
unit_tests/__init__.py ADDED
@@ -0,0 +1,3 @@
1
+ #
2
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
3
+ #
@@ -0,0 +1,25 @@
+ #
+ # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
+ #
+
+ from unittest.mock import MagicMock, Mock
+
+ import pytest
+ from source_commcare.source import SourceCommcare
+
+
+ @pytest.fixture(name="config")
+ def config_fixture():
+     return {"api_key": "apikey", "app_id": "appid", "start_date": "2022-01-01T00:00:00Z"}
+
+
+ def test_check_connection_ok(mocker, config):
+     source = SourceCommcare()
+     logger_mock = Mock()
+     assert source.check_connection(logger_mock, config=config) == (True, None)
+
+
+ def test_check_connection_fail(mocker, config):
+     source = SourceCommcare()
+     logger_mock = MagicMock()
+     assert source.check_connection(logger_mock, config={}) == (False, None)