@adminide-stack/yantra-help-browser 12.0.16-alpha.27 → 12.0.16-alpha.29
- package/lib/components/HelpCenterFooter.d.ts.map +1 -1
- package/lib/components/HelpCenterFooter.js +43 -88
- package/lib/components/HelpCenterFooter.js.map +1 -1
- package/lib/components/HelpCenterHeader.d.ts.map +1 -1
- package/lib/components/HelpCenterHeader.js +3 -8
- package/lib/components/HelpCenterHeader.js.map +1 -1
- package/lib/components/Icons.d.ts +55 -0
- package/lib/components/Icons.d.ts.map +1 -0
- package/lib/{pages/LandingPage/components → components}/Icons.js +39 -2
- package/lib/components/Icons.js.map +1 -0
- package/lib/components/Logo.d.ts +1 -2
- package/lib/components/Logo.d.ts.map +1 -1
- package/lib/components/Logo.js +2 -9
- package/lib/components/Logo.js.map +1 -1
- package/lib/components/PageHero.d.ts +9 -0
- package/lib/components/PageHero.d.ts.map +1 -0
- package/lib/components/PageHero.js +47 -0
- package/lib/components/PageHero.js.map +1 -0
- package/lib/components/SearchBar.d.ts.map +1 -1
- package/lib/components/SearchBar.js +9 -45
- package/lib/components/SearchBar.js.map +1 -1
- package/lib/components/SidebarSearch.js +2 -2
- package/lib/components/SidebarSearch.js.map +1 -1
- package/lib/components/navbar/index.d.ts.map +1 -1
- package/lib/components/navbar/index.js +5 -16
- package/lib/components/navbar/index.js.map +1 -1
- package/lib/compute.d.ts +15 -0
- package/lib/compute.d.ts.map +1 -1
- package/lib/compute.js +74 -2
- package/lib/compute.js.map +1 -1
- package/lib/pages/About/index.d.ts +3 -0
- package/lib/pages/About/index.d.ts.map +1 -0
- package/lib/pages/About/index.js +69 -0
- package/lib/pages/About/index.js.map +1 -0
- package/lib/pages/Careers/index.d.ts +3 -0
- package/lib/pages/Careers/index.d.ts.map +1 -0
- package/lib/pages/Careers/index.js +78 -0
- package/lib/pages/Careers/index.js.map +1 -0
- package/lib/pages/CategoryCollection/index.d.ts.map +1 -1
- package/lib/pages/CategoryCollection/index.js +15 -78
- package/lib/pages/CategoryCollection/index.js.map +1 -1
- package/lib/pages/Community/index.d.ts +3 -0
- package/lib/pages/Community/index.d.ts.map +1 -0
- package/lib/pages/Community/index.js +72 -0
- package/lib/pages/Community/index.js.map +1 -0
- package/lib/pages/Contact/index.d.ts +3 -0
- package/lib/pages/Contact/index.d.ts.map +1 -0
- package/lib/pages/Contact/index.js +128 -0
- package/lib/pages/Contact/index.js.map +1 -0
- package/lib/pages/GetStarted/components/ExampleCard.d.ts.map +1 -1
- package/lib/pages/GetStarted/index.d.ts.map +1 -1
- package/lib/pages/GetStarted/index.js +9 -45
- package/lib/pages/GetStarted/index.js.map +1 -1
- package/lib/pages/HelpCenter/components/HelpCategoryCard.d.ts.map +1 -1
- package/lib/pages/HelpCenter/components/HelpCategoryCard.js +7 -34
- package/lib/pages/HelpCenter/components/HelpCategoryCard.js.map +1 -1
- package/lib/pages/HelpCenter/components/PopularArticle.d.ts.map +1 -1
- package/lib/pages/HelpCenter/components/PopularArticle.js +3 -12
- package/lib/pages/HelpCenter/components/PopularArticle.js.map +1 -1
- package/lib/pages/HelpCenter/index.d.ts.map +1 -1
- package/lib/pages/HelpCenter/index.js +15 -74
- package/lib/pages/HelpCenter/index.js.map +1 -1
- package/lib/pages/LandingPage/components/ArticleCard.d.ts.map +1 -1
- package/lib/pages/LandingPage/components/ArticleCard.js +3 -12
- package/lib/pages/LandingPage/components/ArticleCard.js.map +1 -1
- package/lib/pages/LandingPage/components/CategoryCard.d.ts.map +1 -1
- package/lib/pages/LandingPage/components/CategoryCard.js +3 -12
- package/lib/pages/LandingPage/components/CategoryCard.js.map +1 -1
- package/lib/pages/LandingPage/components/HeroSection.d.ts.map +1 -1
- package/lib/pages/LandingPage/components/HeroSection.js +3 -8
- package/lib/pages/LandingPage/components/HeroSection.js.map +1 -1
- package/lib/pages/LandingPage/components/ResourceCard.d.ts.map +1 -1
- package/lib/pages/LandingPage/components/ResourceCard.js +3 -12
- package/lib/pages/LandingPage/components/ResourceCard.js.map +1 -1
- package/lib/pages/LandingPage/index.d.ts.map +1 -1
- package/lib/pages/LandingPage/index.js +62 -33
- package/lib/pages/LandingPage/index.js.map +1 -1
- package/lib/pages/Markdown/MarkdownPageLayout.d.ts.map +1 -1
- package/lib/pages/Markdown/MarkdownPageLayout.js +11 -56
- package/lib/pages/Markdown/MarkdownPageLayout.js.map +1 -1
- package/lib/pages/Privacy/index.d.ts +3 -0
- package/lib/pages/Privacy/index.d.ts.map +1 -0
- package/lib/pages/Privacy/index.js +449 -0
- package/lib/pages/Privacy/index.js.map +1 -0
- package/lib/pages/ReleaseNotes/index.d.ts +3 -0
- package/lib/pages/ReleaseNotes/index.d.ts.map +1 -0
- package/lib/pages/ReleaseNotes/index.js +61 -0
- package/lib/pages/ReleaseNotes/index.js.map +1 -0
- package/lib/pages/StatusPage/index.d.ts +3 -0
- package/lib/pages/StatusPage/index.d.ts.map +1 -0
- package/lib/pages/StatusPage/index.js +85 -0
- package/lib/pages/StatusPage/index.js.map +1 -0
- package/lib/pages/Terms/index.d.ts +3 -0
- package/lib/pages/Terms/index.d.ts.map +1 -0
- package/lib/pages/Terms/index.js +60 -0
- package/lib/pages/Terms/index.js.map +1 -0
- package/lib/routes.json +88 -0
- package/lib/templates/content/account-management/account-setup.md +56 -57
- package/lib/templates/content/account-management/delete-account.md +55 -58
- package/lib/templates/content/account-management/preferences.md +70 -55
- package/lib/templates/content/account-management/privacy-settings.md +67 -56
- package/lib/templates/content/account-management/profile-settings.md +56 -45
- package/lib/templates/content/browser-extension/browser-extension-overview.md +60 -0
- package/lib/templates/content/browser-extension/extension-security.md +126 -0
- package/lib/templates/content/browser-extension/getting-started-extension.md +96 -0
- package/lib/templates/content/browser-extension/how-it-works.md +112 -0
- package/lib/templates/content/browser-extension/troubleshooting-extension.md +138 -0
- package/lib/templates/content/browser-extension/use-cases-workflows.md +123 -0
- package/lib/templates/content/content-manifest.json +328 -0
- package/lib/templates/content/data-privacy/data-collection.md +55 -51
- package/lib/templates/content/data-privacy/privacy-policy.md +74 -57
- package/lib/templates/content/data-subject-privacy/data-access.md +59 -69
- package/lib/templates/content/data-subject-privacy/data-portability.md +67 -93
- package/lib/templates/content/data-subject-privacy/privacy-requests.md +66 -62
- package/lib/templates/content/file-uploads/file-upload-overview.md +69 -50
- package/lib/templates/content/getting-started/getting-started-guide.md +65 -51
- package/lib/templates/content/product-features/ai-models.md +62 -57
- package/lib/templates/content/product-features/collaboration-tools.md +66 -45
- package/lib/templates/content/product-features/conversation-features.md +59 -34
- package/lib/templates/content/product-features/export-features.md +65 -45
- package/lib/templates/content/product-features/follow-up-questions.md +61 -46
- package/lib/templates/content/product-features/real-time-search.md +63 -46
- package/lib/templates/content/product-features/saved-searches.md +53 -45
- package/lib/templates/content/product-features/search-features.md +61 -43
- package/lib/templates/content/product-features/search-history.md +64 -45
- package/lib/templates/content/product-features/source-citations.md +77 -44
- package/lib/templates/content/scope-api/api-overview.md +83 -46
- package/lib/templates/content/search-modes/deep-research.md +87 -46
- package/lib/templates/content/search-modes/labs-features.md +54 -56
- package/lib/templates/content/search-modes/pro-search.md +53 -45
- package/lib/templates/content/search-modes/regular-search.md +64 -46
- package/lib/templates/content/spaces-library/spaces-overview.md +68 -47
- package/lib/templates/content/student-hub/academic-research.md +85 -48
- package/lib/templates/content/student-hub/student-discounts.md +51 -57
- package/lib/templates/content/student-hub/student-overview.md +64 -45
- package/lib/templates/content/student-hub/study-tools.md +99 -45
- package/lib/templates/content/subscription-billing/billing-cycle.md +50 -48
- package/lib/templates/content/subscription-billing/billing-overview.md +55 -34
- package/lib/templates/content/subscription-billing/billing-support.md +52 -45
- package/lib/templates/content/subscription-billing/currency-support.md +36 -62
- package/lib/templates/content/subscription-billing/enterprise-pricing.md +52 -46
- package/lib/templates/content/subscription-billing/invoice-management.md +64 -45
- package/lib/templates/content/subscription-billing/payment-methods.md +53 -52
- package/lib/templates/content/subscription-billing/promotional-offers.md +58 -46
- package/lib/templates/content/subscription-billing/refund-policy.md +53 -58
- package/lib/templates/content/subscription-billing/student-discounts.md +61 -57
- package/lib/templates/content/subscription-billing/tax-information.md +64 -45
- package/lib/templates/content/technical-questions/ai-models-technical.md +60 -56
- package/lib/templates/content/technical-questions/api-technical.md +131 -56
- package/lib/templates/content/technical-questions/data-processing.md +74 -55
- package/lib/templates/content/technical-questions/database-architecture.md +91 -54
- package/lib/templates/content/technical-questions/infrastructure.md +104 -55
- package/lib/templates/content/technical-questions/performance-optimization.md +77 -55
- package/lib/templates/content/technical-questions/search-algorithms.md +78 -56
- package/lib/templates/content/technical-questions/technical-overview.md +76 -52
- package/lib/templates/content/threads/conversation-management.md +81 -45
- package/lib/templates/content/threads/threads-overview.md +69 -45
- package/lib/templates/content/troubleshooting/common-issues.md +91 -43
- package/lib/templates/content/what-is-yantra/getting-started-yantra.md +58 -57
- package/package.json +2 -2
- package/lib/pages/LandingPage/components/Icons.d.ts +0 -13
- package/lib/pages/LandingPage/components/Icons.d.ts.map +0 -1
- package/lib/pages/LandingPage/components/Icons.js.map +0 -1
|
@@ -1,83 +1,102 @@
|
|
|
1
|
-
# Data Processing
|
|
1
|
+
# Data Processing Pipeline
|
|
2
2
|
|
|
3
|
-
How Yantra
|
|
3
|
+
How Yantra ingests, transforms, analyzes, and stores your data — from raw input to search-ready content.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
---
|
|
6
6
|
|
|
7
|
-
|
|
7
|
+
## Data Ingestion
|
|
8
8
|
|
|
9
|
-
|
|
10
|
-
- **API Integration**: Third-party API data ingestion
|
|
11
|
-
- **User Input**: User-generated content processing
|
|
12
|
-
- **File Uploads**: Document and file processing
|
|
9
|
+
Yantra supports multiple ingestion pathways to accommodate diverse data sources:
|
|
13
10
|
|
|
14
|
-
|
|
11
|
+
| Source | Method | Details |
|
|
12
|
+
| ------------ | ----------------------- | ----------------------------------------------------------------------- |
|
|
13
|
+
| Web content | Automated crawling | Configurable crawl schedules, depth limits, and domain allowlists |
|
|
14
|
+
| APIs | REST/GraphQL connectors | Pre-built connectors for 50+ services (Slack, Notion, Confluence, etc.) |
|
|
15
|
+
| File uploads | Drag-and-drop or API | Supports PDF, DOCX, PPTX, CSV, Markdown, HTML, and plain text |
|
|
16
|
+
| Databases | Direct connectors | PostgreSQL, MySQL, MongoDB read-only connectors with incremental sync |
|
|
15
17
|
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
- **
|
|
19
|
-
- **
|
|
18
|
+
### Ingestion guarantees
|
|
19
|
+
|
|
20
|
+
- **Exactly-once processing** — Deduplication ensures no content is indexed twice.
|
|
21
|
+
- **Incremental updates** — Only new or modified content is re-processed, minimizing compute costs.
|
|
22
|
+
- **Schema validation** — Incoming data is validated against expected schemas before entering the pipeline.
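
The deduplication and incremental-update guarantees above can be illustrated with a content-hash check. This is a minimal sketch, not Yantra's implementation: `IngestionIndex` and `should_process` are hypothetical names, and the in-memory dictionary stands in for whatever durable store a real pipeline would use.

```python
import hashlib


class IngestionIndex:
    """Tracks one content hash per document ID (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self._hash_by_id: dict[str, str] = {}

    def should_process(self, doc_id: str, content: str) -> bool:
        """Return True only for new documents or documents whose content changed."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if self._hash_by_id.get(doc_id) == digest:
            return False  # unchanged content: skip re-processing
        self._hash_by_id[doc_id] = digest
        return True


index = IngestionIndex()
results = [
    index.should_process("doc-1", "hello"),   # first delivery
    index.should_process("doc-1", "hello"),   # duplicate delivery
    index.should_process("doc-1", "hello!"),  # modified content
]
```

Skipping unchanged content makes repeated deliveries harmless, which is one common way to get exactly-once effects on top of at-least-once delivery.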
|
|
23
|
+
|
|
24
|
+
---
|
|
20
25
|
|
|
21
26
|
## Processing Architecture
|
|
22
27
|
|
|
23
|
-
### Stream
|
|
28
|
+
### Stream processing
|
|
29
|
+
|
|
30
|
+
Yantra uses an event-driven architecture built on **Apache Kafka** for real-time data flow:
|
|
31
|
+
|
|
32
|
+
1. **Producers** publish raw content events to topic partitions.
|
|
33
|
+
2. **Stream processors** (Kafka Streams) consume events, apply transformations, and emit enriched records.
|
|
34
|
+
3. **Consumers** write processed data to the appropriate storage layer (PostgreSQL, Elasticsearch, Redis).
|
|
35
|
+
|
|
36
|
+
### Batch processing
|
|
24
37
|
|
|
25
|
-
|
|
26
|
-
- **Event Streaming**: Event-driven processing
|
|
27
|
-
- **Message Queues**: Asynchronous processing
|
|
28
|
-
- **Batch Processing**: Batch data processing
|
|
38
|
+
For large-volume imports or periodic re-indexing, Yantra runs batch jobs using distributed task queues:
|
|
29
39
|
|
|
30
|
-
|
|
40
|
+
- Jobs are broken into chunks of 1,000 records.
|
|
41
|
+
- Each chunk is processed in parallel across worker nodes.
|
|
42
|
+
- Progress is tracked in real time and is visible in the Admin Dashboard.
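
As a rough sketch of the chunking step described above, the following splits a record stream into chunks of 1,000 and processes them in parallel. The thread pool, `process_chunk`, and `run_batch` are assumptions for illustration; the text does not say which task queue or worker model is used.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Iterable, Iterator

CHUNK_SIZE = 1_000  # chunk size stated in the text


def chunked(records: Iterable[int], size: int = CHUNK_SIZE) -> Iterator[list[int]]:
    """Yield consecutive lists of at most `size` records."""
    it = iter(records)
    while chunk := list(islice(it, size)):
        yield chunk


def process_chunk(chunk: list[int]) -> int:
    """Placeholder work: report how many records were handled."""
    return len(chunk)


def run_batch(records: Iterable[int], workers: int = 4) -> int:
    """Process all chunks in parallel and return the total record count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunked(records)))


chunks = list(chunked(range(2_500)))
total = run_batch(range(2_500))
```

The final chunk is simply shorter than the rest, so no padding or special casing is needed.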
|
|
31
43
|
|
|
32
|
-
|
|
33
|
-
- **Data Mapping**: Data field mapping
|
|
34
|
-
- **Format Conversion**: Data format conversion
|
|
35
|
-
- **Schema Evolution**: Schema management
|
|
44
|
+
---
|
|
36
45
|
|
|
37
46
|
## Content Analysis
|
|
38
47
|
|
|
39
|
-
|
|
48
|
+
Every piece of content passes through a multi-stage analysis pipeline:
|
|
40
49
|
|
|
41
|
-
|
|
42
|
-
- **Sentiment Analysis**: Sentiment detection
|
|
43
|
-
- **Topic Modeling**: Topic extraction
|
|
44
|
-
- **Entity Recognition**: Named entity recognition
|
|
50
|
+
### Text processing
|
|
45
51
|
|
|
46
|
-
|
|
52
|
+
- **Language detection** — Automatically identifies 50+ languages.
|
|
53
|
+
- **Tokenization** — Text is split into meaningful tokens using language-specific tokenizers.
|
|
54
|
+
- **Normalization** — Unicode normalization, lowercasing, and stop-word removal.
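
The normalization step can be sketched with the standard library alone. The example below assumes NFKC as the Unicode normalization form and uses a tiny illustrative stop-word list; whitespace splitting is only a stand-in for the language-specific tokenizers mentioned above.

```python
import unicodedata

# Tiny illustrative stop-word list (an assumption, not Yantra's list).
STOP_WORDS = frozenset({"a", "an", "and", "of", "the", "to"})


def normalize(text: str) -> list[str]:
    """Apply Unicode normalization, lowercase, and drop stop words."""
    text = unicodedata.normalize("NFKC", text).lower()
    tokens = text.split()  # stand-in for a real tokenizer
    return [token for token in tokens if token not in STOP_WORDS]


# "\ufb01" is the single-character "fi" ligature; NFKC expands it to "fi".
tokens = normalize("The \ufb01rst Draft of the Report")
```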
|
|
47
55
|
|
|
48
|
-
|
|
49
|
-
- **Quality Assessment**: Content quality scoring
|
|
50
|
-
- **Relevance Scoring**: Relevance assessment
|
|
51
|
-
- **Duplicate Detection**: Duplicate content detection
|
|
56
|
+
### Semantic analysis
|
|
52
57
|
|
|
53
|
-
|
|
58
|
+
- **Embedding generation** — Each document is converted to a high-dimensional vector using state-of-the-art embedding models.
|
|
59
|
+
- **Topic modeling** — Latent topics are extracted using LDA and transformer-based approaches.
|
|
60
|
+
- **Entity recognition** — People, organizations, dates, locations, and custom entities are identified and tagged.
|
|
61
|
+
- **Sentiment analysis** — Content sentiment (positive, negative, neutral) is scored for analytics.
|
|
62
|
+
|
|
63
|
+
### Quality scoring
|
|
64
|
+
|
|
65
|
+
| Signal | Weight | Description |
|
|
66
|
+
| ---------------- | ------ | ------------------------------------------------ |
|
|
67
|
+
| Completeness | 25% | Does the content cover its topic thoroughly? |
|
|
68
|
+
| Freshness | 20% | How recently was the content created or updated? |
|
|
69
|
+
| Source authority | 30% | How trustworthy is the source domain? |
|
|
70
|
+
| Readability | 15% | Flesch-Kincaid score and structural quality |
|
|
71
|
+
| Uniqueness | 10% | Duplicate and near-duplicate detection |
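
The table gives weights but not the formula that combines them. One plausible reading is a linear weighted sum, sketched below under the assumption that every signal is already normalised to the range 0 to 1. The function name and sample values are illustrative only.

```python
# Weights taken from the table above.
WEIGHTS = {
    "completeness": 0.25,
    "freshness": 0.20,
    "source_authority": 0.30,
    "readability": 0.15,
    "uniqueness": 0.10,
}


def quality_score(signals: dict[str, float]) -> float:
    """Weighted sum of signals, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)


example = {
    "completeness": 0.8,
    "freshness": 0.5,
    "source_authority": 1.0,
    "readability": 0.6,
    "uniqueness": 1.0,
}
score = quality_score(example)  # 0.20 + 0.10 + 0.30 + 0.09 + 0.10
```

Because the weights sum to 100%, the resulting score also stays within 0 to 1.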
|
|
54
72
|
|
|
55
|
-
|
|
73
|
+
---
|
|
74
|
+
|
|
75
|
+
## Data Storage
|
|
56
76
|
|
|
57
|
-
-
|
|
58
|
-
- **Search Storage**: Elasticsearch index
|
|
59
|
-
- **Cache Storage**: Redis cache
|
|
60
|
-
- **File Storage**: Object storage system
|
|
77
|
+
### Multi-tier storage architecture
|
|
61
78
|
|
|
62
|
-
|
|
79
|
+
| Tier | Technology | Purpose | Retention |
|
|
80
|
+
| ---- | ---------------------------- | ------------------------------------------ | -------------------- |
|
|
81
|
+
| Hot | PostgreSQL + Redis | Active queries, user data, real-time cache | Indefinite |
|
|
82
|
+
| Warm | Elasticsearch | Full-text search index, vector search | Indefinite |
|
|
83
|
+
| Cold | S3-compatible object storage | Archived content, raw uploads, backups | Per retention policy |
|
|
63
84
|
|
|
64
|
-
|
|
65
|
-
- **Data Retention**: Data retention policies
|
|
66
|
-
- **Data Archival**: Data archival processes
|
|
67
|
-
- **Data Deletion**: Secure data deletion
|
|
85
|
+
### Data lifecycle
|
|
68
86
|
|
|
69
|
-
|
|
87
|
+
1. **Ingest** — Raw data lands in the processing queue.
|
|
88
|
+
2. **Process** — Content is analyzed, enriched, and indexed.
|
|
89
|
+
3. **Serve** — Processed data is available for search and retrieval.
|
|
90
|
+
4. **Archive** — Older content is compressed and moved to cold storage based on access patterns.
|
|
91
|
+
5. **Delete** — Data past its retention period is securely purged (overwritten + cryptographic erasure).
|
|
70
92
|
|
|
71
|
-
|
|
93
|
+
---
|
|
72
94
|
|
|
73
|
-
|
|
74
|
-
- **Distributed Processing**: Distributed computing
|
|
75
|
-
- **Caching**: Intelligent caching strategies
|
|
76
|
-
- **Optimization**: Performance optimization
|
|
95
|
+
## Performance & Scalability
|
|
77
96
|
|
|
78
|
-
|
|
97
|
+
- **Throughput** — The pipeline processes 10,000+ documents per minute at steady state.
|
|
98
|
+
- **Latency** — End-to-end ingestion-to-searchable time is under 30 seconds for real-time sources.
|
|
99
|
+
- **Horizontal scaling** — Worker nodes auto-scale based on queue depth. During peak loads, the system scales from 4 to 32 workers automatically.
|
|
100
|
+
- **Backpressure handling** — If downstream systems slow down, the pipeline applies backpressure to producers rather than dropping data.
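
Two of the behaviours above, queue-depth autoscaling and backpressure, can be sketched in a few lines. The worker bounds of 4 and 32 come from the text; `TARGET_PER_WORKER`, the buffer size, and both function names are hypothetical.

```python
import math
import queue

MIN_WORKERS, MAX_WORKERS = 4, 32  # bounds stated in the text
TARGET_PER_WORKER = 500           # hypothetical backlog each worker should handle


def desired_workers(queue_depth: int) -> int:
    """Scale worker count with queue depth, clamped to the stated bounds."""
    needed = math.ceil(queue_depth / TARGET_PER_WORKER)
    return max(MIN_WORKERS, min(MAX_WORKERS, needed))


# A bounded buffer: when it is full, the producer is told to slow down.
buffer: queue.Queue = queue.Queue(maxsize=2)


def try_publish(item: int) -> bool:
    """Return False when the buffer is full; the caller keeps the item and retries."""
    try:
        buffer.put_nowait(item)
        return True
    except queue.Full:
        return False


accepted = [try_publish(i) for i in range(3)]
```

The rejected item is never discarded; it stays with the producer, which is what distinguishes backpressure from load shedding.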
|
|
79
101
|
|
|
80
|
-
|
|
81
|
-
- **Load Distribution**: Load balancing
|
|
82
|
-
- **Resource Management**: Resource optimization
|
|
83
|
-
- **Auto-scaling**: Automatic scaling
|
|
102
|
+
> **Enterprise customers** can configure custom processing rules, retention policies, and data routing via the Admin Dashboard.
|
|
@@ -1,83 +1,120 @@
|
|
|
1
1
|
# Database Architecture
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
How Yantra stores, indexes, and retrieves data across multiple database systems optimized for different access patterns.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
---
|
|
6
6
|
|
|
7
|
-
|
|
7
|
+
## Database Design Philosophy
|
|
8
8
|
|
|
9
|
-
|
|
10
|
-
- **Schema Design**: Optimized database schema
|
|
11
|
-
- **Indexing Strategy**: Comprehensive indexing
|
|
12
|
-
- **Partitioning**: Database partitioning
|
|
9
|
+
Yantra follows a **polyglot persistence** strategy — each data type is stored in the database engine best suited for its access pattern:
|
|
13
10
|
|
|
14
|
-
|
|
11
|
+
| Data type | Engine | Why |
|
|
12
|
+
| ------------------------------ | --------------------- | ------------------------------------------------------- |
|
|
13
|
+
| User accounts, billing, config | PostgreSQL 16 | ACID transactions, relational integrity, mature tooling |
|
|
14
|
+
| Full-text + vector search | Elasticsearch 8.x | Inverted indexes, BM25 ranking, kNN vector search |
|
|
15
|
+
| Sessions, cache, rate limits | Redis 7 | Sub-millisecond reads, TTL support, pub/sub |
|
|
16
|
+
| File uploads, backups          | S3-compatible storage | Virtually unlimited capacity, 11 nines of durability    |
|
|
15
17
|
|
|
16
|
-
|
|
17
|
-
- **Content Data**: Search content and metadata
|
|
18
|
-
- **Usage Data**: User interaction and analytics data
|
|
19
|
-
- **System Data**: System configuration and logs
|
|
18
|
+
---
|
|
20
19
|
|
|
21
|
-
##
|
|
20
|
+
## PostgreSQL — Primary Database
|
|
22
21
|
|
|
23
|
-
###
|
|
22
|
+
### Schema design principles
|
|
24
23
|
|
|
25
|
-
- **
|
|
26
|
-
- **
|
|
27
|
-
- **
|
|
28
|
-
- **
|
|
24
|
+
- **Normalized core tables** — Users, organizations, subscriptions, and permissions follow 3NF to avoid data anomalies.
|
|
25
|
+
- **JSONB for flexibility** — Metadata, user preferences, and integration configs use JSONB columns, combining schema flexibility with indexing support.
|
|
26
|
+
- **Timestamped everything** — Every table includes `created_at` and `updated_at` columns with timezone-aware timestamps.
|
|
27
|
+
- **Soft deletes** — Records are marked as deleted rather than physically removed, enabling audit trails and data recovery.
|
|
29
28
|
|
|
30
|
-
###
|
|
29
|
+
### Indexing strategy
|
|
31
30
|
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
-
|
|
35
|
-
|
|
31
|
+
| Index type | Use case | Example |
|
|
32
|
+
| ---------- | -------------------------- | -------------------------------------------------- |
|
|
33
|
+
| B-tree | Equality and range queries | `WHERE created_at > '2026-01-01'` |
|
|
34
|
+
| GIN | JSONB containment queries | `WHERE metadata @> '{"type": "pdf"}'` |
|
|
35
|
+
| Partial | Hot data subsets | `WHERE status = 'active'` (index only active rows) |
|
|
36
|
+
| Covering | Avoid table lookups | Include all `SELECT` columns in the index |
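
To make the B-tree and partial rows of the table concrete, here is a small runnable sketch. It uses SQLite through Python's `sqlite3` module purely as a stand-in for PostgreSQL; the table and index names are hypothetical, and GIN indexes are PostgreSQL-specific, so they are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        owner_id INTEGER NOT NULL,
        status TEXT NOT NULL,
        created_at TEXT NOT NULL
    );
    -- B-tree index for range queries on created_at
    CREATE INDEX idx_documents_created_at ON documents (created_at);
    -- Partial index: only rows with status = 'active' are indexed
    CREATE INDEX idx_documents_active_owner
        ON documents (owner_id) WHERE status = 'active';
    """
)
conn.executemany(
    "INSERT INTO documents (owner_id, status, created_at) VALUES (?, ?, ?)",
    [
        (1, "active", "2026-01-15"),
        (1, "archived", "2025-11-02"),
        (2, "active", "2025-12-20"),
    ],
)

# Range query that the created_at index can serve.
recent = conn.execute(
    "SELECT id FROM documents WHERE created_at > '2026-01-01'"
).fetchall()

# Query whose WHERE clause matches the partial index condition.
active_for_owner = conn.execute(
    "SELECT id FROM documents WHERE status = 'active' AND owner_id = ?", (1,)
).fetchall()

index_names = {
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
}
```

A partial index stays small because archived rows never enter it, which is the point of the "hot data subsets" row.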
|
|
36
37
|
|
|
37
|
-
|
|
38
|
+
### High availability
|
|
38
39
|
|
|
39
|
-
|
|
40
|
+
- **Streaming replication** — One synchronous standby + two async replicas.
|
|
41
|
+
- **Automatic failover** — Patroni manages leader election; failover completes in < 10 seconds.
|
|
42
|
+
- **Point-in-time recovery** — WAL archiving enables recovery to any second within the retention window (30 days).
|
|
40
43
|
|
|
41
|
-
|
|
42
|
-
- **Index Optimization**: Database index optimization
|
|
43
|
-
- **Execution Plans**: Query execution optimization
|
|
44
|
-
- **Connection Pooling**: Database connection pooling
|
|
44
|
+
---
|
|
45
45
|
|
|
46
|
-
|
|
46
|
+
## Elasticsearch — Search Index
|
|
47
47
|
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
48
|
+
### Index architecture
|
|
49
|
+
|
|
50
|
+
Each content type has its own Elasticsearch index with optimized mappings:
|
|
51
|
+
|
|
52
|
+
- **Text fields** use `text` type with custom analyzers (language-specific stemming, synonym expansion).
|
|
53
|
+
- **Vector fields** use `dense_vector` type for kNN semantic search.
|
|
54
|
+
- **Keyword fields** for exact-match filtering (tags, content type, source).
|
|
55
|
+
- **Date fields** for time-range queries and recency boosting.
|
|
56
|
+
|
|
57
|
+
### Cluster topology
|
|
58
|
+
|
|
59
|
+
- **3 dedicated master nodes** for cluster coordination.
|
|
60
|
+
- **6+ data nodes** with SSD storage for search performance.
|
|
61
|
+
- **2 coordinating nodes** for query routing and result aggregation.
|
|
62
|
+
|
|
63
|
+
---
|
|
64
|
+
|
|
65
|
+
## Redis — Cache & Real-Time
|
|
66
|
+
|
|
67
|
+
### Cache patterns
|
|
68
|
+
|
|
69
|
+
| Pattern | Use case | TTL |
|
|
70
|
+
| ------------------- | ---------------------------- | ----------------------- |
|
|
71
|
+
| Query result cache | Identical search queries | 5 minutes |
|
|
72
|
+
| Session store | User authentication sessions | 1 hour |
|
|
73
|
+
| Rate limit counters | API rate limiting | Rolling 1-minute window |
|
|
74
|
+
| Pub/Sub channels | Real-time notifications | N/A (ephemeral) |
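
The rolling 1-minute rate-limit window from the table can be sketched in memory as follows. This is an illustration of the pattern only: it does not issue Redis commands, and `RollingWindowLimiter`, its parameters, and the limit of 2 are hypothetical. Timestamps are passed in explicitly so the behaviour is deterministic.

```python
from collections import deque


class RollingWindowLimiter:
    """Allow at most `limit` requests per key within the trailing window."""

    def __init__(self, limit: int, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: dict[str, deque] = {}

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits.setdefault(key, deque())
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = RollingWindowLimiter(limit=2)
decisions = [limiter.allow("client-a", t) for t in (0.0, 10.0, 20.0, 61.0)]
```

The request at 61 seconds succeeds because the hit at 0 seconds has left the window by then.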
|
|
75
|
+
|
|
76
|
+
### Memory management
|
|
77
|
+
|
|
78
|
+
- **Maxmemory policy** set to `allkeys-lru` — least recently used keys are evicted when memory limits are reached.
|
|
79
|
+
- **Key namespacing** — All keys are prefixed by service name to avoid collisions.
|
|
80
|
+
- **Cluster mode** — Redis Cluster with 6 nodes (3 primary + 3 replica) for horizontal scaling.
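
The eviction behaviour named above can be sketched with an ordered dictionary. Note that this is a strict LRU, whereas Redis's `allkeys-lru` policy approximates LRU by sampling keys. The class name and the service-prefixed keys are illustrative.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Evicts the least recently used key once capacity is exceeded."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data: OrderedDict = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key: str, value: str) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used key

    def keys(self) -> list[str]:
        return list(self._data)


cache = LRUCache(capacity=2)
cache.set("search:q1", "r1")   # keys are prefixed by service name
cache.set("search:q2", "r2")
cache.get("search:q1")         # q1 becomes most recently used
cache.set("session:u1", "s1")  # exceeds capacity, so q2 is evicted
remaining = cache.keys()
```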
|
|
81
|
+
|
|
82
|
+
---
|
|
52
83
|
|
|
53
84
|
## Data Management
|
|
54
85
|
|
|
55
|
-
###
|
|
86
|
+
### Backup strategy
|
|
87
|
+
|
|
88
|
+
| What | Frequency | Retention | Method |
|
|
89
|
+
| ------------- | --------------------------- | ---------- | ------------------------------- |
|
|
90
|
+
| PostgreSQL | Continuous WAL + daily full | 30 days | pg_basebackup + WAL archiving |
|
|
91
|
+
| Elasticsearch | Daily snapshots | 14 days | Snapshot to S3 |
|
|
92
|
+
| Redis | RDB snapshots + AOF | 7 days | Automated via Redis persistence |
|
|
93
|
+
| File storage | Cross-region replication | Indefinite | S3 cross-region replication |
|
|
56
94
|
|
|
57
|
-
|
|
58
|
-
- **Data Processing**: Data transformation and processing
|
|
59
|
-
- **Data Storage**: Data storage and organization
|
|
60
|
-
- **Data Archival**: Data archival and cleanup
|
|
95
|
+
### Data governance
|
|
61
96
|
|
|
62
|
-
|
|
97
|
+
- **Encryption at rest** — AES-256 for all database storage volumes.
|
|
98
|
+
- **Encryption in transit** — TLS 1.3 for all database connections.
|
|
99
|
+
- **Access control** — Database credentials are rotated monthly via HashiCorp Vault.
|
|
100
|
+
- **Audit logging** — All schema changes and administrative queries are logged.
|
|
63
101
|
|
|
64
|
-
|
|
65
|
-
- **Data Cleaning**: Data cleaning and normalization
|
|
66
|
-
- **Data Monitoring**: Data quality monitoring
|
|
67
|
-
- **Data Governance**: Data governance policies
|
|
102
|
+
---
|
|
68
103
|
|
|
69
104
|
## Scalability
|
|
70
105
|
|
|
71
|
-
###
|
|
106
|
+
### Current capacity
|
|
72
107
|
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
108
|
+
| Metric | Value |
|
|
109
|
+
| --------------------------------- | ------ |
|
|
110
|
+
| Total indexed documents | 500M+ |
|
|
111
|
+
| Database size (PostgreSQL) | 2.4 TB |
|
|
112
|
+
| Search index size (Elasticsearch) | 8.7 TB |
|
|
113
|
+
| Peak queries per second | 12,000 |
|
|
114
|
+
| Average query latency | 45 ms |
|
|
77
115
|
|
|
78
|
-
###
|
|
116
|
+
### Scaling strategy
|
|
79
117
|
|
|
80
|
-
- **
|
|
81
|
-
- **
|
|
82
|
-
- **
|
|
83
|
-
- **Alerting**: Database performance alerting
|
|
118
|
+
- **Vertical** — Increase instance sizes for immediate capacity (database-level).
|
|
119
|
+
- **Horizontal** — Add read replicas (PostgreSQL), data nodes (Elasticsearch), or shard nodes (Redis) for linear scaling.
|
|
120
|
+
- **Partitioning** — Time-based partitioning for PostgreSQL tables with high write volume.
|
|
@@ -1,83 +1,132 @@
|
|
|
1
|
-
# Infrastructure
|
|
1
|
+
# Infrastructure & Reliability
|
|
2
2
|
|
|
3
|
-
Yantra's
|
|
3
|
+
Yantra's cloud infrastructure is designed for high availability, global performance, and security at every layer.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
---
|
|
6
6
|
|
|
7
|
-
|
|
7
|
+
## Cloud Architecture
|
|
8
8
|
|
|
9
|
-
|
|
10
|
-
- **Microservices**: Microservices architecture
|
|
11
|
-
- **Containerization**: Docker containerization
|
|
12
|
-
- **Orchestration**: Kubernetes orchestration
|
|
9
|
+
### Multi-region deployment
|
|
13
10
|
|
|
14
|
-
|
|
11
|
+
Yantra runs across **3 AWS regions** (US East, EU West, AP Southeast) with active-active configuration. User traffic is routed to the nearest region via latency-based DNS routing.
|
|
15
12
|
|
|
16
|
-
|
|
17
|
-
- **Storage**: Distributed storage systems
|
|
18
|
-
- **Networking**: High-performance networking
|
|
19
|
-
- **Security**: Infrastructure security
|
|
13
|
+
### Container orchestration
|
|
20
14
|
|
|
21
|
-
|
|
15
|
+
All services run as Docker containers orchestrated by **Kubernetes (EKS)**:
|
|
22
16
|
|
|
23
|
-
|
|
17
|
+
- **Namespaces** isolate production, staging, and development environments.
|
|
18
|
+
- **Resource quotas** prevent any single service from consuming excessive cluster resources.
|
|
19
|
+
- **Rolling deployments** ensure zero-downtime updates with automatic rollback on health check failures.
|
|
20
|
+
- **Horizontal Pod Autoscaler** adjusts replica counts based on CPU, memory, and custom metrics.
|
|
24
21
|
|
|
25
|
-
|
|
26
|
-
- **Load Balancing**: Advanced load balancing
|
|
27
|
-
- **Distributed Systems**: Distributed architecture
|
|
28
|
-
- **Resource Pooling**: Resource pooling strategies
|
|
22
|
+
### Service mesh
|
|
29
23
|
|
|
30
|
-
|
|
24
|
+
An Istio-based service mesh provides:
|
|
31
25
|
|
|
32
|
-
- **
|
|
33
|
-
- **
|
|
34
|
-
- **
|
|
35
|
-
|
|
26
|
+
- **Mutual TLS** between all services (zero-trust networking).
|
|
27
|
+
- **Traffic management** — Canary deployments, circuit breaking, retry policies.
|
|
28
|
+
- **Observability** — Distributed tracing with Jaeger, metrics with Prometheus.
|
|
29
|
+
|
|
30
|
+
---
|
|
36
31
|
|
|
37
32
|
## High Availability
|
|
38
33
|
|
|
39
|
-
### Availability
|
|
34
|
+
### Availability targets
|
|
35
|
+
|
|
36
|
+
| Component | Target SLA | Actual (trailing 12 months) |
|
|
37
|
+
| -------------- | ---------- | --------------------------- |
|
|
38
|
+
| API Gateway | 99.99% | 99.995% |
|
|
39
|
+
| Search Service | 99.95% | 99.98% |
|
|
40
|
+
| AI Service | 99.9% | 99.94% |
|
|
41
|
+
| Data Pipeline | 99.9% | 99.92% |
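
To put the targets in the table into perspective, an availability percentage converts directly into a yearly downtime budget. The sketch below assumes a 365-day year; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Targets taken from the table above.
targets = {
    "API Gateway": 99.99,
    "Search Service": 99.95,
    "AI Service": 99.9,
    "Data Pipeline": 99.9,
}
budgets = {name: allowed_downtime_minutes(t) for name, t in targets.items()}
```

Under this assumption, 99.99% allows roughly 53 minutes of downtime per year, 99.95% roughly 4.4 hours, and 99.9% roughly 8.8 hours.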
|
|
42
|
+
|
|
43
|
+
### Redundancy design
|
|
44
|
+
|
|
45
|
+
- **No single point of failure** — Every component has at least 2 replicas across different availability zones.
|
|
46
|
+
- **Database failover** — Automated failover with < 10-second recovery for PostgreSQL and Redis.
|
|
47
|
+
- **Cross-region replication** — Critical data is replicated across regions for disaster recovery.
|
|
48
|
+
- **Graceful degradation** — If the AI service is unavailable, search still returns results without AI-generated summaries.
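
Graceful degradation, as described in the last bullet, might look like the following sketch. The exception type, the `search` function, and the placeholder result are all hypothetical; the point is only that a failure in the summarizer does not fail the search.

```python
from typing import Callable, Optional


class ServiceUnavailable(Exception):
    """Hypothetical error raised when the AI service cannot be reached."""


def search(query: str, summarize: Callable[[list[str]], str]) -> dict:
    """Return search results, adding an AI summary only when it is available."""
    results = [f"result for {query}"]  # placeholder for the real search call
    summary: Optional[str]
    try:
        summary = summarize(results)
    except ServiceUnavailable:
        summary = None  # degrade: results are still returned
    return {"results": results, "summary": summary}


def healthy_summarizer(results: list[str]) -> str:
    return f"{len(results)} result(s) found"


def failing_summarizer(results: list[str]) -> str:
    raise ServiceUnavailable("AI service is down")


normal = search("kafka", healthy_summarizer)
degraded = search("kafka", failing_summarizer)
```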
|
|
49
|
+
|
|
50
|
+
### Disaster recovery
|
|
51
|
+
|
|
52
|
+
| Metric | Target | Actual |
|
|
53
|
+
| ------------------------------ | --------- | ---------- |
|
|
54
|
+
| Recovery Time Objective (RTO) | < 4 hours | 2.1 hours |
|
|
55
|
+
| Recovery Point Objective (RPO) | < 1 hour | 15 minutes |
|
|
56
|
+
| DR test frequency | Quarterly | Monthly |
|
|
57
|
+
|
|
58
|
+
---
|
|
59
|
+
|
|
60
|
+
## Monitoring & Observability
|
|
61
|
+
|
|
62
|
+
### The three pillars
|
|
40
63
|
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
64
|
+
| Pillar | Tools | Details |
|
|
65
|
+
| ------- | -------------------------------- | ------------------------------------------------ |
|
|
66
|
+
| Metrics | Prometheus + Grafana | 2,000+ custom metrics, 15-second scrape interval |
|
|
67
|
+
| Logs | Fluentd + Elasticsearch + Kibana | Structured JSON logs, 30-day retention |
|
|
68
|
+
| Traces | Jaeger + OpenTelemetry | End-to-end request tracing across all services |
|
|
45
69
|
|
|
46
|
-
###
|
|
70
|
+
### Alerting
|
|
47
71
|
|
|
48
|
-
- **
|
|
49
|
-
- **
|
|
50
|
-
- **
|
|
51
|
-
- **
|
|
72
|
+
- **PagerDuty integration** for critical alerts (P1/P2) with automatic escalation.
|
|
73
|
+
- **Slack notifications** for warnings and informational alerts.
|
|
74
|
+
- **Anomaly detection** — ML-based alerting detects unusual patterns before they become incidents.
|
|
75
|
+
- **Runbooks** — Every alert links to a runbook with diagnosis steps and remediation procedures.
|
|
76
|
+
|
|
77
|
+
---
|
|
52
78
|
|
|
53
79
|
## Security Infrastructure
|
|
54
80
|
|
|
55
|
-
###
|
|
81
|
+
### Network security
|
|
82
|
+
|
|
83
|
+
- **VPC isolation** — All services run in private subnets with no direct internet access.
|
|
84
|
+
- **WAF (Web Application Firewall)** — Protects against OWASP Top 10 threats.
|
|
85
|
+
- **DDoS protection** — AWS Shield Advanced with automatic traffic scrubbing.
|
|
86
|
+
- **Egress filtering** — Outbound traffic is restricted to known-good destinations.
|
|
87
|
+
|
|
88
|
+
### Secrets management
|
|
89
|
+
|
|
90
|
+
- **HashiCorp Vault** for all secrets, API keys, and database credentials.
|
|
91
|
+
- **Automatic rotation** — Secrets are rotated on configurable schedules (default: 30 days).
|
|
92
|
+
- **Just-in-time access** — Engineers request temporary elevated access via an approval workflow.
|
|
93
|
+
|
|
94
|
+
---
|
|
95
|
+
|
|
96
|
+
## CI/CD Pipeline
|
|
97
|
+
|
|
98
|
+
### Deployment flow
|
|
99
|
+
|
|
100
|
+
1. **Code push** — Developer pushes to a feature branch on GitHub.
|
|
101
|
+
2. **CI checks** — Automated linting, type checking, unit tests, and integration tests run in GitHub Actions.
|
|
102
|
+
3. **Build** — Docker images are built, scanned for vulnerabilities, and pushed to ECR.
|
|
103
|
+
4. **Staging deploy** — ArgoCD deploys to the staging environment automatically.
|
|
104
|
+
5. **QA validation** — Automated end-to-end tests + manual spot checks.
|
|
105
|
+
6. **Production deploy** — Canary deployment to 5% of traffic, then gradual rollout to 100%.
|
|
106
|
+
7. **Post-deploy monitoring** — Automated health checks verify error rates and latency for 30 minutes.
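
The canary step in the flow above routes a fixed percentage of traffic to the new version. The text does not say how that split is made, so the following is only one common approach: hashing a stable request or user identifier into 100 buckets. The function name and identifiers are hypothetical.

```python
import hashlib


def in_canary(request_id: str, percent: int) -> bool:
    """Deterministically place an ID in the canary if its bucket is below `percent`."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return bucket < percent


ids = [f"user-{i}" for i in range(10_000)]
canary_share = sum(in_canary(i, 5) for i in ids) / len(ids)
```

Because the bucket depends only on the identifier, raising the percentage during a gradual rollout keeps every ID that was already in the canary on the new version.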
|
|
56
107
|
|
|
57
|
-
|
|
58
|
-
- **Application Security**: Application security
|
|
59
|
-
- **Data Security**: Data protection measures
|
|
60
|
-
- **Access Control**: Access control systems
|
|
108
|
+
### Deployment frequency
|
|
61
109
|
|
|
62
|
-
|
|
110
|
+
- **Production deploys** — 8-12 per day across all services.
|
|
111
|
+
- **Rollback time** — Under 60 seconds to previous version.
|
|
112
|
+
- **Feature flags** — LaunchDarkly for gradual feature rollouts and instant kill switches.
|
|
63
113
|
|
|
64
|
-
|
|
65
|
-
- **Compliance Monitoring**: Compliance monitoring
|
|
66
|
-
- **Audit Logging**: Comprehensive audit logs
|
|
67
|
-
- **Security Testing**: Regular security testing
|
|
114
|
+
---
|
|
68
115
|
|
|
69
|
-
## Global
|
|
116
|
+
## Global Performance
|
|
70
117
|
|
|
71
|
-
###
|
|
118
|
+
### Content delivery
|
|
72
119
|
|
|
73
|
-
- **
|
|
74
|
-
- **Edge
|
|
75
|
-
- **
|
|
76
|
-
- **Data Residency**: Data residency compliance
|
|
120
|
+
- **200+ edge locations** via CloudFront CDN.
|
|
121
|
+
- **Edge caching** for static assets with 85%+ cache hit rate.
|
|
122
|
+
- **Dynamic content acceleration** — Optimized TCP connections and HTTP/3 support.
|
|
77
123
|
|
|
78
|
-
###
|
|
124
|
+
### Latency optimization
|
|
79
125
|
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
126
|
+
| Region | Average API latency |
|
|
127
|
+
| -------------- | ------------------- |
|
|
128
|
+
| US East | 35 ms |
|
|
129
|
+
| US West | 52 ms |
|
|
130
|
+
| EU West | 41 ms |
|
|
131
|
+
| AP Southeast | 68 ms |
|
|
132
|
+
| Global average | 48 ms |
|