registerbekanntmachungen 0.1.0
- checksums.yaml +7 -0
- data/CHANGELOG.md +5 -0
- data/LICENSE +18 -0
- data/README.md +89 -0
- data/Rakefile +4 -0
- data/db/2024-09/registerbekanntmachungen-2024-09-27.json +2542 -0
- data/db/2024-09/registerbekanntmachungen-2024-09-28.json +367 -0
- data/db/2024-09/registerbekanntmachungen-2024-09-29.json +8 -0
- data/db/2024-09/registerbekanntmachungen-2024-09-30.json +2336 -0
- data/db/2024-10/registerbekanntmachungen-2024-10-01.json +2911 -0
- data/db/2024-10/registerbekanntmachungen-2024-10-02.json +2265 -0
- data/db/2024-10/registerbekanntmachungen-2024-10-03.json +367 -0
- data/db/2024-10/registerbekanntmachungen-2024-10-04.json +1318 -0
- data/db/2024-10/registerbekanntmachungen-2024-10-05.json +260 -0
- data/db/2024-10/registerbekanntmachungen-2024-10-06.json +9 -0
- data/db/2024-10/registerbekanntmachungen-2024-10-07.json +1663 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-08.json +1605 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-09.json +259 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-10.json +8 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-11.json +1282 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-12.json +1496 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-13.json +1471 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-14.json +2240 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-15.json +1328 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-16.json +8 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-17.json +8 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-18.json +1628 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-19.json +2136 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-20.json +1567 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-21.json +1902 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-22.json +1232 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-23.json +235 -0
- data/db/2024-11/registerbekanntmachungen-2024-11-24.json +32 -0
- data/lib/registerbekanntmachungen/parser.rb +147 -0
- data/lib/registerbekanntmachungen/version.rb +5 -0
- data/lib/registerbekanntmachungen.rb +368 -0
- data/sig/registerbekanntmachungen.rbs +4 -0
- metadata +139 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: 7d6e8c077be1fcf802c1b2bb310ee128ccf4c3c08d38aece2a851ecce25c4f3a
  data.tar.gz: 2c36e536f4ef72e412d50b0e0f3207117ad04c3eac02b584635643ca5fca36f2
SHA512:
  metadata.gz: 2059587467850ca8e1166b49aff5c42526eeeec2cb7ed85a30528728976104a394a1575f6131f5ed2a87dc2447b79a0369f6d135a59839fa7997a3ccc37e89c1
  data.tar.gz: 23f2429283f9bc19c5f68f37b913a3609cb5ad5d1db198cbe711d4bfb93297a05d8a3f97f3e0271262256292b8c2046a81795734811bc67b466775368aa9ed43
data/CHANGELOG.md
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,18 @@
Copyright 2024 Christopher Özbek

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the “Software”), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,89 @@
# Registerbekanntmachungen

[![License MIT](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)
[![Daily Scrape CI](https://github.com/coezbek/registerbekanntmachungen/actions/workflows/daily_scrape.yml/badge.svg)](https://github.com/coezbek/registerbekanntmachungen/actions/workflows/daily_scrape.yml)

> [!IMPORTANT]
> This project is not affiliated with handelsregister.de. It is not an official source of truth. This data may be incomplete or outdated. The tool might mangle or miss entries. Use at your own risk.

This Ruby Gem provides a simple scraper for the "Registerbekanntmachungen" (notices of changes in the corporate directory) announced by the [German Handelsregister on their page](https://www.handelsregister.de/rp_web/xhtml/bekanntmachungen.xhtml).

These Registerbekanntmachungen only contain officially mandated publications such as those mandated by [Handelsgesetzbuch HGB §10](https://www.gesetze-im-internet.de/hgb/__10.html). They do not contain all changes in the corporate directory, only those that are required to be published (in particular announcements of deletions, 'Löschungen', and of transformations, 'Umwandlungen').

Since the data is static once published, small (only <= 200 announcements per day) and only the last 8 weeks are available at handelsregister.de, this repository also hosts [daily snapshots](./db) of the data sorted by month in the `db` directory. In most cases, running the scraper yourself is thus unnecessary.
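
The snapshot files follow a predictable layout: one directory per month and one JSON file per day. The following is a minimal sketch of building the path for a given date, assuming the `db/YYYY-MM/registerbekanntmachungen-YYYY-MM-DD.json` naming used by the files in this repository. `snapshot_path` is a hypothetical helper for illustration, not part of the gem's API.

```ruby
require "date"

# Hypothetical helper (not part of the gem): builds the relative path of the
# daily snapshot for `date`, assuming the layout
# db/YYYY-MM/registerbekanntmachungen-YYYY-MM-DD.json.
def snapshot_path(date, root: "db")
  File.join(root, date.strftime("%Y-%m"), "registerbekanntmachungen-#{date.iso8601}.json")
end

puts snapshot_path(Date.new(2024, 10, 1))
# prints: db/2024-10/registerbekanntmachungen-2024-10-01.json
```

Passing a different `root` (for example the path of a local clone) keeps the helper independent of the current working directory.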

> [!IMPORTANT]
> Handelsregister.de's Terms & Conditions mandate that users should not perform more than 60 'searches' or 'retrievals' per hour. Neither term is defined, so it is unclear what constitutes either. It is also unclear on what legal basis Handelsregister.de believes it has a right to impose such limits, since the HGB places no restrictions on an individual's ability to access this publicly available information for the purpose of information. Since information in the Handelsregister can be updated on a daily basis and there is no means of bulk 'retrieval', it could be argued that individuals must be able to perform as many accesses as are necessary to obtain all company information once per day. For the Registerbekanntmachungen this means on the order of 200 accesses per day per user.
> Handelsregister.de also claims that material is copyrighted unless otherwise noted. Since all material which is available for access is provided by the register courts to handelsregister.de to ensure retrieval, it is unclear to what material they really claim copyright and why.

## Related projects

https://github.com/bundesAPI/handelsregister provides a Python CLI for accessing the Handelsregister itself.

## Usage / Installation

To use the gem, it is best to clone this repository:

```bash
git clone https://github.com/coezbek/registerbekanntmachungen
cd registerbekanntmachungen
bundle
```

Run the scraper in verbose mode for announcements published today:

```bash
bundle exec lib/registerbekanntmachungen.rb -v
```

All available options:

```bash
$ bundle exec lib/registerbekanntmachungen.rb --help
Usage: registerbekanntmachungen [options]
    -v, --verbose                    Enable verbose/debug output
    -r, --reload                     Reload data and skip cache
        --no-save                    Do not save any data just print to stdout
        --start-date DATE            Start date in format DD.MM.YYYY or YYYY-MM-DD
        --end-date DATE              End date in format DD.MM.YYYY or YYYY-MM-DD
    -o, --oldest                     Download oldest available data not already saved
    -y, --yesterday                  Download yesterday's data
    -a, --all                        Download all data from the last 8 weeks
        --no-headless                Don't run browser in headless mode
    -h, --help                       Displays Help
```
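
The `--start-date` and `--end-date` options accept two date notations. The following sketch shows one way to normalise both into a `Date`. It is an illustration only: `parse_cli_date` is a hypothetical helper, and the gem's own option handling may work differently.

```ruby
require "date"

# Hypothetical helper (not the gem's own parser): accepts either DD.MM.YYYY
# or YYYY-MM-DD and returns a Date. Anything else raises ArgumentError.
def parse_cli_date(text)
  case text
  when /\A\d{2}\.\d{2}\.\d{4}\z/
    Date.strptime(text, "%d.%m.%Y")
  when /\A\d{4}-\d{2}-\d{2}\z/
    Date.strptime(text, "%Y-%m-%d")
  else
    raise ArgumentError, "expected DD.MM.YYYY or YYYY-MM-DD, got #{text.inspect}"
  end
end

puts parse_cli_date("01.10.2024") == parse_cli_date("2024-10-01")
# prints: true
```

Checking the shape with a regular expression first keeps ambiguous inputs such as `10/01/2024` from being guessed at.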

*Note*:
- Data usually isn't published on weekends, but sometimes it is.
- If you run this tool in the morning you might want to use it again in the evening/night, because new announcements are published throughout the day.
- Data is only available for the last 8 weeks (7*8 = 56 days) on handelsregister.de. Use '-o' to obtain the data from the oldest available date.
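
The 56-day window from the last note can be turned into a quick check of whether a date may still be retrievable. This is a sketch under assumptions: both helpers are hypothetical, and whether the site counts the window inclusively at either end is not stated, so treat the boundary dates as approximate.

```ruby
require "date"

AVAILABILITY_WINDOW_DAYS = 7 * 8 # 8 weeks = 56 days, as stated in the note

# Hypothetical helper: earliest date that may still be listed, counting back
# from `today`. The exact boundary used by handelsregister.de is an assumption.
def oldest_available_date(today = Date.today, window_days: AVAILABILITY_WINDOW_DAYS)
  today - window_days
end

# Hypothetical helper: true if `date` lies inside the assumed window.
def possibly_available?(date, today = Date.today)
  date.between?(oldest_available_date(today), today)
end

puts oldest_available_date(Date.new(2024, 11, 24))
# prints: 2024-09-29
```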

## Data Schema

The data contains the following fields for each announcement:

- `date`: The date the announcement was published as an ISO 8601 string, e.g. "2024-10-01".
- `original_text`: The original link text of the announcement.
- `court`: The court where the announcement was published, e.g. "Amtsgericht Berlin (Charlottenburg)".
- `registernumber`: The register number of the company including the prefix (e.g. "HRB 12345" or "VR 123").
- `registerart`: The type of the register, e.g. "HRA" for Handelsregister A or "VR" for Vereinsregister.
- `company_name`: The name of the company, can include special characters.
- `type`: The type of the announcement. Possible values I have observed so far:
  - Sonderregisterbekanntmachung OHNE Bezug zum elektr. Register
  - Löschungsankündigung
  - Sonstige Registerbekanntmachung
  - Registerbekanntmachung nach dem Umwandlungsgesetz
  - Einreichung neuer Dokumente
- `state`: The Federal State of Germany where the company is registered.
- `company_seat`: The city where the company is registered.
- `former_court`: In case the company registration was moved from another court, the name of the former court.
- `details`: The text detail message shown on the handelsregister.de website.

If an announcement is of type "Sonderregisterbekanntmachung OHNE Bezug zum elektr. Register", the available fields are different:

- `date`, `state`, `court`, `company_name`, `type`, `details`, `original_text` are available as well.
- Instead of `registernumber` there is a `sonderegister_referenz`, which might be a registernumber but could be something else.
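
The field list above can be exercised with a small script. The records below are made up for illustration, and the top-level layout (a plain JSON array of announcements) is an assumption that may not match the real snapshot files. `register_reference` is a hypothetical helper showing how the two reference fields could be handled uniformly.

```ruby
require "json"

SONDERREGISTER_TYPE = "Sonderregisterbekanntmachung OHNE Bezug zum elektr. Register"

# Made-up sample records for illustration only. The top-level layout of the
# real snapshot files is an assumption here (a plain array of announcements).
SAMPLE_JSON = <<~JSON
  [
    {
      "date": "2024-10-01",
      "court": "Amtsgericht Berlin (Charlottenburg)",
      "registernumber": "HRB 12345",
      "registerart": "HRB",
      "company_name": "Beispiel GmbH",
      "type": "Löschungsankündigung"
    },
    {
      "date": "2024-10-01",
      "court": "Amtsgericht Berlin (Charlottenburg)",
      "company_name": "Beispiel e.V.",
      "type": "Sonderregisterbekanntmachung OHNE Bezug zum elektr. Register",
      "sonderegister_referenz": "VR 123"
    }
  ]
JSON

# Hypothetical helper: returns the reference field that applies to the
# announcement's type (`sonderegister_referenz` or `registernumber`).
def register_reference(announcement)
  if announcement["type"] == SONDERREGISTER_TYPE
    announcement["sonderegister_referenz"]
  else
    announcement["registernumber"]
  end
end

ANNOUNCEMENTS = JSON.parse(SAMPLE_JSON)

# Keep only deletion announcements and print one line per entry.
DELETIONS = ANNOUNCEMENTS.select { |a| a["type"] == "Löschungsankündigung" }
DELETIONS.each do |a|
  puts "#{a['date']} #{register_reference(a)} #{a['company_name']}"
end
```

Since `JSON.parse` returns string keys by default, the fields are read with `a["type"]` rather than `a[:type]`.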

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/coezbek/registerbekanntmachungen.
data/Rakefile
ADDED