data_porter 2.0.0 → 2.1.1

Files changed (33)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +21 -0
  3. data/README.md +7 -1
  4. data/ROADMAP.md +84 -74
  5. data/app/assets/javascripts/data_porter/progress_controller.js +3 -3
  6. data/app/controllers/data_porter/concerns/import_validation.rb +5 -5
  7. data/app/views/data_porter/imports/index.html.erb +23 -23
  8. data/app/views/data_porter/imports/new.html.erb +11 -11
  9. data/app/views/data_porter/imports/show.html.erb +19 -19
  10. data/app/views/data_porter/mapping_templates/_form.html.erb +10 -10
  11. data/app/views/data_porter/mapping_templates/edit.html.erb +2 -2
  12. data/app/views/data_porter/mapping_templates/index.html.erb +10 -10
  13. data/app/views/data_porter/mapping_templates/new.html.erb +2 -2
  14. data/config/locales/en.yml +123 -0
  15. data/config/locales/fr.yml +123 -0
  16. data/config/routes.rb +2 -2
  17. data/lib/data_porter/components/mapping/column_row.rb +1 -1
  18. data/lib/data_porter/components/mapping/form.rb +4 -4
  19. data/lib/data_porter/components/mapping/template_select.rb +1 -1
  20. data/lib/data_porter/components/preview/results_summary.rb +13 -5
  21. data/lib/data_porter/components/preview/summary_cards.rb +5 -4
  22. data/lib/data_porter/components/preview/table.rb +3 -3
  23. data/lib/data_porter/components/progress/bar.rb +9 -2
  24. data/lib/data_porter/components/shared/pagination.rb +9 -5
  25. data/lib/data_porter/components/shared/status_badge.rb +3 -1
  26. data/lib/data_porter/engine.rb +4 -0
  27. data/lib/data_porter/orchestrator/record_builder.rb +1 -1
  28. data/lib/data_porter/record_validator.rb +2 -2
  29. data/lib/data_porter/version.rb +1 -1
  30. data/lib/generators/data_porter/locale/locale_generator.rb +42 -0
  31. data/mkdocs.yml +98 -0
  32. metadata +6 -3
  33. data/bookmarklet.md +0 -217
data/lib/generators/data_porter/locale/locale_generator.rb ADDED
@@ -0,0 +1,42 @@
+ # frozen_string_literal: true
+
+ require "rails/generators"
+
+ module DataPorter
+   module Generators
+     class LocaleGenerator < Rails::Generators::Base
+       source_root File.expand_path("../../../../config/locales", __dir__)
+
+       argument :locale, type: :string, default: "en"
+
+       def copy_locale_file
+         source = source_file
+         destination = "config/locales/data_porter.#{locale}.yml"
+
+         copy_file(source, destination)
+         gsub_file(destination, /^#{source_locale}:/, "#{locale}:") unless locale == source_locale
+       end
+
+       def show_instructions
+         say ""
+         say "DataPorter locale file created: config/locales/data_porter.#{locale}.yml", :green
+         say ""
+         say "Next steps:"
+         say " 1. Translate the values in the generated file"
+         say " 2. Set your default locale in config/application.rb:"
+         say " config.i18n.default_locale = :#{locale}"
+         say ""
+       end
+
+       private
+
+       def source_file
+         File.exist?(File.join(self.class.source_root, "#{locale}.yml")) ? "#{locale}.yml" : "en.yml"
+       end
+
+       def source_locale
+         source_file.delete_suffix(".yml")
+       end
+     end
+   end
+ end
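Reviewer note: the new generator falls back to `en.yml` when the gem does not ship the requested locale, then rewrites only the top-level locale key. Going by Rails' generator naming convention it would presumably be invoked as `bin/rails generate data_porter:locale fr`, though the diff itself does not show that command. The sketch below reproduces the fallback and key rewrite in plain Ruby so the behavior can be checked without Rails. `copy_locale`, `source_root`, and `app_root` are hypothetical names introduced for illustration and are not part of the gem.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for LocaleGenerator#copy_locale_file, without Rails/Thor.
# source_root plays the role of the gem's config/locales directory,
# app_root plays the role of the host application.
def copy_locale(source_root, app_root, locale = "en")
  # Same fallback as #source_file: prefer <locale>.yml, otherwise en.yml.
  source_file = File.exist?(File.join(source_root, "#{locale}.yml")) ? "#{locale}.yml" : "en.yml"
  source_locale = source_file.delete_suffix(".yml")

  destination = File.join(app_root, "config/locales/data_porter.#{locale}.yml")
  FileUtils.mkdir_p(File.dirname(destination))

  content = File.read(File.join(source_root, source_file))
  # Same rewrite as the gsub_file call: only a key at the start of a line changes.
  content = content.gsub(/^#{source_locale}:/, "#{locale}:") unless locale == source_locale
  File.write(destination, content)
  destination
end

# Demo with a throwaway directory: request "de" when only en.yml exists.
Dir.mktmpdir do |dir|
  gem_locales = File.join(dir, "gem_locales")
  FileUtils.mkdir_p(gem_locales)
  File.write(File.join(gem_locales, "en.yml"), "en:\n  data_porter:\n    title: Imports\n")

  path = copy_locale(gem_locales, File.join(dir, "app"), "de")
  puts File.read(path).lines.first # prints "de:"
end
```

The generated file still contains the English values under the new key, which is why the generator's instructions ask the user to translate them.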
data/mkdocs.yml ADDED
@@ -0,0 +1,98 @@
+ site_name: DataPorter
+ site_description: A mountable Rails engine for data import workflows
+ site_url: https://seryllns.github.io/data_porter
+ repo_url: https://github.com/SerylLns/data_porter
+ repo_name: SerylLns/data_porter
+ edit_uri: edit/main/docs/
+
+ theme:
+   name: material
+   palette:
+     - media: "(prefers-color-scheme: light)"
+       scheme: default
+       primary: indigo
+       accent: indigo
+       toggle:
+         icon: material/brightness-7
+         name: Switch to dark mode
+     - media: "(prefers-color-scheme: dark)"
+       scheme: slate
+       primary: indigo
+       accent: indigo
+       toggle:
+         icon: material/brightness-4
+         name: Switch to light mode
+   font:
+     text: Inter
+     code: JetBrains Mono
+   icon:
+     repo: fontawesome/brands/github
+     logo: material/database-import
+   features:
+     - navigation.sections
+     - navigation.expand
+     - navigation.top
+     - navigation.indexes
+     - navigation.footer
+     - search.highlight
+     - search.suggest
+     - content.code.copy
+     - content.code.annotate
+     - content.tabs.link
+     - toc.follow
+
+ plugins:
+   - search
+   - exclude:
+       glob:
+         - blog/*
+         - blog_part_2/*
+         - V1_PLAN.md
+         - dev_to.md
+         - reddit_post.md
+
+ markdown_extensions:
+   - admonition
+   - pymdownx.details
+   - pymdownx.superfences
+   - pymdownx.highlight:
+       anchor_linenums: true
+       line_spans: __span
+       pygments_lang_class: true
+   - pymdownx.inlinehilite
+   - pymdownx.tabbed:
+       alternate_style: true
+   - pymdownx.emoji:
+       emoji_index: !!python/name:material.extensions.emoji.twemoji
+       emoji_generator: !!python/name:material.extensions.emoji.to_svg
+   - pymdownx.snippets:
+       base_path: ["."]
+   - attr_list
+   - md_in_html
+   - tables
+   - toc:
+       permalink: true
+
+ extra:
+   social:
+     - icon: fontawesome/brands/github
+       link: https://github.com/SerylLns/data_porter
+     - icon: fontawesome/brands/dev
+       link: https://dev.to/seryllns
+   generator: false
+
+ extra_css:
+   - stylesheets/extra.css
+
+ nav:
+   - Home: index.md
+   - Getting Started: getting-started.md
+   - Reference:
+       - Configuration: CONFIGURATION.md
+       - Targets: TARGETS.md
+       - Sources: SOURCES.md
+       - Column Mapping: MAPPING.md
+       - Routes: routes.md
+   - Roadmap: ROADMAP.md
+   - Changelog: changelog.md
+   - Contributing: contributing.md
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: data_porter
  version: !ruby/object:Gem::Version
-   version: 2.0.0
+   version: 2.1.1
  platform: ruby
  authors:
  - Seryl Lounis
@@ -139,7 +139,8 @@ files:
  - app/views/data_porter/mapping_templates/index.html.erb
  - app/views/data_porter/mapping_templates/new.html.erb
  - app/views/layouts/data_porter/application.html.erb
- - bookmarklet.md
+ - config/locales/en.yml
+ - config/locales/fr.yml
  - config/routes.rb
  - lib/data_porter.rb
  - lib/data_porter/broadcaster.rb
@@ -183,9 +184,11 @@ files:
  - lib/generators/data_porter/install/templates/create_data_porter_imports.rb.erb
  - lib/generators/data_porter/install/templates/create_data_porter_mapping_templates.rb.erb
  - lib/generators/data_porter/install/templates/initializer.rb
+ - lib/generators/data_porter/locale/locale_generator.rb
  - lib/generators/data_porter/target/target_generator.rb
  - lib/generators/data_porter/target/templates/target.rb.tt
  - lib/tasks/data_porter.rake
+ - mkdocs.yml
  - sig/data_porter.rbs
  homepage: https://github.com/SerylLns/data_porter
  licenses:
@@ -194,7 +197,7 @@ metadata:
    homepage_uri: https://github.com/SerylLns/data_porter
    source_code_uri: https://github.com/SerylLns/data_porter
    changelog_uri: https://github.com/SerylLns/data_porter/blob/main/CHANGELOG.md
-   documentation_uri: https://github.com/SerylLns/data_porter#readme
+   documentation_uri: https://seryllns.github.io/data_porter/
    bug_tracker_uri: https://github.com/SerylLns/data_porter/issues
    rubygems_mcp_server_uri: https://rubygems.org/gems/data_porter
    rubygems_mfa_required: 'true'
data/bookmarklet.md DELETED
@@ -1,217 +0,0 @@
- ---
- title: "Building a Product Clipper Bookmarklet with Shadow DOM and Structured Data"
- published: false
- tags: javascript, webdev, architecture, bookmarklet
- ---
-
- # Building a Product Clipper Bookmarklet with Shadow DOM and Structured Data
-
- We’re in 2008.
-
- The iPhone 3G just dropped. Facebook crosses 100 million users. Bitcoin quietly appears on a cryptography mailing list. The web is shifting.
-
- And while the world is obsessing over apps and platforms… we’re going back to something beautifully simple.
-
- **A bookmarklet.**
-
- No extension store review.
- No packaging.
- No deployment delays.
-
- Just a small piece of JavaScript living inside a browser bookmark — capable of injecting a clean, isolated UI into any e-commerce page, detecting product data automatically, and sending it to your backend.
-
- One click. Any product page. Instant extraction.
-
- Here’s how the architecture works.
-
- ---
-
- ## High-Level Architecture
-
- The clipper follows a simple execution model:
-
- 1. The bookmarklet injects a remote script into the current page.
- 2. The script scans the DOM using multiple detection strategies.
- 3. A sidebar panel renders inside a Shadow DOM (fully isolated).
- 4. Detected products are visually highlighted.
- 5. The user selects items to import.
- 6. Selected data is sent to the backend for processing.
-
- No browser extension. No build complexity. Just runtime execution.
-
- ---
-
- ## Why a Bookmarklet?
-
- For a user-triggered action ("Import this page"), a bookmarklet offers strong trade-offs:
-
- | | Bookmarklet | Browser Extension |
- | ----------- | --------------------- | -------------------------------- |
- | Install | Drag a link | Store review required |
- | Updates | Instant (server-side) | Requires store re-approval |
- | Permissions | None | Explicit permission prompts |
- | Maintenance | Single hosted file | Multi-file manifest architecture |
-
- If your tool runs only when explicitly triggered, a bookmarklet is often the leanest solution.
-
- ---
-
- ## Product Detection Strategy
-
- No single detection method works across all e-commerce sites.
-
- A robust clipper layers multiple strategies, ordered by confidence:
-
- - **Structured Data (JSON-LD)**
-   Many sites expose `schema.org/Product` data for SEO.
- - **Microdata attributes**
- - **OpenGraph metadata**
- - **Heuristic DOM scanning**
- - **URL pattern matching (fallback)**
-
- The key principle is:
-
- > Prefer high-confidence structured data, then gracefully degrade.
-
- ### Example: Extracting JSON-LD Products
-
- ```javascript
- function extractFromJsonLd() {
-   const scripts = document.querySelectorAll(
-     'script[type="application/ld+json"]',
-   );
-   const products = [];
-
-   scripts.forEach((script) => {
-     try {
-       const data = JSON.parse(script.textContent);
-       // Traverse recursively and collect Product objects
-       collectProducts(data, products);
-     } catch (e) {}
-   });
-
-   return products;
- }
- ```
-
- In practice, you normalize URLs, merge duplicates, and score sources by confidence before presenting results.
-
- The goal is reliability, not perfection.
-
- ---
-
- ## Shadow DOM: Isolation Is Non-Negotiable
-
- Injecting UI into arbitrary websites is dangerous.
-
- CSS resets, `!important` rules, framework styles — they will break your interface.
-
- Shadow DOM solves this by creating an isolated rendering tree:
-
- ```javascript
- const host = document.createElement("div");
- document.body.appendChild(host);
-
- const shadow = host.attachShadow({ mode: "open" });
- shadow.innerHTML = `
-   <style>
-     :host { all: initial; font-family: system-ui; }
-     .panel { position: fixed; right: 0; top: 0; }
-   </style>
-   <div class="panel">Clipper UI</div>
- `;
- ```
-
- Key principle:
-
- > Your UI must behave identically on Shopify, Magento, custom React apps, or legacy PHP pages.
-
- Isolation is mandatory.
-
- ---
-
- ## Visual Feedback & Interaction Control
-
- When products are detected, highlighting them directly on the page improves user confidence.
-
- Because many e-commerce sites attach their own click handlers (analytics, routing, SPA navigation), event handling must be carefully managed.
-
- Best practice:
-
- - Use capture phase listeners
- - Prevent unintended navigation
- - Clean up all listeners and styles on teardown
-
- A clipper should leave **zero traces** after closing.
-
- ---
-
- ## Backend Communication
-
- Once items are selected, the clipper sends a structured payload to your backend.
-
- The backend typically:
-
- - Normalizes URLs
- - Deduplicates products
- - Associates them with a source domain
- - Triggers downstream processing (price tracking, enrichment, etc.)
-
- Security considerations:
-
- - Use scoped, short-lived API tokens
- - Never expose sensitive credentials
- - Sanitize all extracted DOM content before rendering
-
- ---
-
- ## Limitations
-
- A bookmarklet runs inside the page’s context. That comes with constraints.
-
- ### Content Security Policy (CSP)
-
- Strict `script-src` headers can block injected scripts entirely.
- There is no client-side workaround. A browser extension is required in those cases.
-
- ### Single Page Applications (SPAs)
-
- React/Next.js apps often load content asynchronously.
- Mutation observers or delayed scans improve detection reliability.
-
- ### Bot Protection (Backend)
-
- While the bookmarklet runs in the user’s real browser session, backend scraping of submitted URLs may face anti-bot systems.
- That is a server-side concern.
-
- ---
-
- ## Legal & Ethical Considerations
-
- If you build a commercial tool around product extraction:
-
- - Only collect publicly visible data
- - Do not bypass authentication or CAPTCHAs
- - Respect rate limits
- - Be transparent about data usage
- - Review relevant laws (CFAA, GDPR, local regulations)
-
- User-initiated clipping is typically lower risk than automated crawling, but not risk-free.
-
- ---
-
- ## Lessons Learned
-
- 1. **Isolation first.** Shadow DOM prevents 90% of UI conflicts.
- 2. **Layered detection beats single heuristics.**
- 3. **Keep it framework-free.** Dependencies increase fragility.
- 4. **Design for hostile environments.** You do not control the host page.
- 5. **Simplicity wins.** A single hosted file can outperform complex extension architectures.
-
- ---
-
- A bookmarklet is not flashy.
-
- But when designed correctly, it becomes a powerful bridge between arbitrary web pages and your product.
-
- Sometimes the most effective architecture is the one that avoids complexity entirely.