@adminide-stack/yantra-help-browser 12.0.16-alpha.27 → 12.0.16-alpha.29

Files changed (163)
  1. package/lib/components/HelpCenterFooter.d.ts.map +1 -1
  2. package/lib/components/HelpCenterFooter.js +43 -88
  3. package/lib/components/HelpCenterFooter.js.map +1 -1
  4. package/lib/components/HelpCenterHeader.d.ts.map +1 -1
  5. package/lib/components/HelpCenterHeader.js +3 -8
  6. package/lib/components/HelpCenterHeader.js.map +1 -1
  7. package/lib/components/Icons.d.ts +55 -0
  8. package/lib/components/Icons.d.ts.map +1 -0
  9. package/lib/{pages/LandingPage/components → components}/Icons.js +39 -2
  10. package/lib/components/Icons.js.map +1 -0
  11. package/lib/components/Logo.d.ts +1 -2
  12. package/lib/components/Logo.d.ts.map +1 -1
  13. package/lib/components/Logo.js +2 -9
  14. package/lib/components/Logo.js.map +1 -1
  15. package/lib/components/PageHero.d.ts +9 -0
  16. package/lib/components/PageHero.d.ts.map +1 -0
  17. package/lib/components/PageHero.js +47 -0
  18. package/lib/components/PageHero.js.map +1 -0
  19. package/lib/components/SearchBar.d.ts.map +1 -1
  20. package/lib/components/SearchBar.js +9 -45
  21. package/lib/components/SearchBar.js.map +1 -1
  22. package/lib/components/SidebarSearch.js +2 -2
  23. package/lib/components/SidebarSearch.js.map +1 -1
  24. package/lib/components/navbar/index.d.ts.map +1 -1
  25. package/lib/components/navbar/index.js +5 -16
  26. package/lib/components/navbar/index.js.map +1 -1
  27. package/lib/compute.d.ts +15 -0
  28. package/lib/compute.d.ts.map +1 -1
  29. package/lib/compute.js +74 -2
  30. package/lib/compute.js.map +1 -1
  31. package/lib/pages/About/index.d.ts +3 -0
  32. package/lib/pages/About/index.d.ts.map +1 -0
  33. package/lib/pages/About/index.js +69 -0
  34. package/lib/pages/About/index.js.map +1 -0
  35. package/lib/pages/Careers/index.d.ts +3 -0
  36. package/lib/pages/Careers/index.d.ts.map +1 -0
  37. package/lib/pages/Careers/index.js +78 -0
  38. package/lib/pages/Careers/index.js.map +1 -0
  39. package/lib/pages/CategoryCollection/index.d.ts.map +1 -1
  40. package/lib/pages/CategoryCollection/index.js +15 -78
  41. package/lib/pages/CategoryCollection/index.js.map +1 -1
  42. package/lib/pages/Community/index.d.ts +3 -0
  43. package/lib/pages/Community/index.d.ts.map +1 -0
  44. package/lib/pages/Community/index.js +72 -0
  45. package/lib/pages/Community/index.js.map +1 -0
  46. package/lib/pages/Contact/index.d.ts +3 -0
  47. package/lib/pages/Contact/index.d.ts.map +1 -0
  48. package/lib/pages/Contact/index.js +128 -0
  49. package/lib/pages/Contact/index.js.map +1 -0
  50. package/lib/pages/GetStarted/components/ExampleCard.d.ts.map +1 -1
  51. package/lib/pages/GetStarted/index.d.ts.map +1 -1
  52. package/lib/pages/GetStarted/index.js +9 -45
  53. package/lib/pages/GetStarted/index.js.map +1 -1
  54. package/lib/pages/HelpCenter/components/HelpCategoryCard.d.ts.map +1 -1
  55. package/lib/pages/HelpCenter/components/HelpCategoryCard.js +7 -34
  56. package/lib/pages/HelpCenter/components/HelpCategoryCard.js.map +1 -1
  57. package/lib/pages/HelpCenter/components/PopularArticle.d.ts.map +1 -1
  58. package/lib/pages/HelpCenter/components/PopularArticle.js +3 -12
  59. package/lib/pages/HelpCenter/components/PopularArticle.js.map +1 -1
  60. package/lib/pages/HelpCenter/index.d.ts.map +1 -1
  61. package/lib/pages/HelpCenter/index.js +15 -74
  62. package/lib/pages/HelpCenter/index.js.map +1 -1
  63. package/lib/pages/LandingPage/components/ArticleCard.d.ts.map +1 -1
  64. package/lib/pages/LandingPage/components/ArticleCard.js +3 -12
  65. package/lib/pages/LandingPage/components/ArticleCard.js.map +1 -1
  66. package/lib/pages/LandingPage/components/CategoryCard.d.ts.map +1 -1
  67. package/lib/pages/LandingPage/components/CategoryCard.js +3 -12
  68. package/lib/pages/LandingPage/components/CategoryCard.js.map +1 -1
  69. package/lib/pages/LandingPage/components/HeroSection.d.ts.map +1 -1
  70. package/lib/pages/LandingPage/components/HeroSection.js +3 -8
  71. package/lib/pages/LandingPage/components/HeroSection.js.map +1 -1
  72. package/lib/pages/LandingPage/components/ResourceCard.d.ts.map +1 -1
  73. package/lib/pages/LandingPage/components/ResourceCard.js +3 -12
  74. package/lib/pages/LandingPage/components/ResourceCard.js.map +1 -1
  75. package/lib/pages/LandingPage/index.d.ts.map +1 -1
  76. package/lib/pages/LandingPage/index.js +62 -33
  77. package/lib/pages/LandingPage/index.js.map +1 -1
  78. package/lib/pages/Markdown/MarkdownPageLayout.d.ts.map +1 -1
  79. package/lib/pages/Markdown/MarkdownPageLayout.js +11 -56
  80. package/lib/pages/Markdown/MarkdownPageLayout.js.map +1 -1
  81. package/lib/pages/Privacy/index.d.ts +3 -0
  82. package/lib/pages/Privacy/index.d.ts.map +1 -0
  83. package/lib/pages/Privacy/index.js +449 -0
  84. package/lib/pages/Privacy/index.js.map +1 -0
  85. package/lib/pages/ReleaseNotes/index.d.ts +3 -0
  86. package/lib/pages/ReleaseNotes/index.d.ts.map +1 -0
  87. package/lib/pages/ReleaseNotes/index.js +61 -0
  88. package/lib/pages/ReleaseNotes/index.js.map +1 -0
  89. package/lib/pages/StatusPage/index.d.ts +3 -0
  90. package/lib/pages/StatusPage/index.d.ts.map +1 -0
  91. package/lib/pages/StatusPage/index.js +85 -0
  92. package/lib/pages/StatusPage/index.js.map +1 -0
  93. package/lib/pages/Terms/index.d.ts +3 -0
  94. package/lib/pages/Terms/index.d.ts.map +1 -0
  95. package/lib/pages/Terms/index.js +60 -0
  96. package/lib/pages/Terms/index.js.map +1 -0
  97. package/lib/routes.json +88 -0
  98. package/lib/templates/content/account-management/account-setup.md +56 -57
  99. package/lib/templates/content/account-management/delete-account.md +55 -58
  100. package/lib/templates/content/account-management/preferences.md +70 -55
  101. package/lib/templates/content/account-management/privacy-settings.md +67 -56
  102. package/lib/templates/content/account-management/profile-settings.md +56 -45
  103. package/lib/templates/content/browser-extension/browser-extension-overview.md +60 -0
  104. package/lib/templates/content/browser-extension/extension-security.md +126 -0
  105. package/lib/templates/content/browser-extension/getting-started-extension.md +96 -0
  106. package/lib/templates/content/browser-extension/how-it-works.md +112 -0
  107. package/lib/templates/content/browser-extension/troubleshooting-extension.md +138 -0
  108. package/lib/templates/content/browser-extension/use-cases-workflows.md +123 -0
  109. package/lib/templates/content/content-manifest.json +328 -0
  110. package/lib/templates/content/data-privacy/data-collection.md +55 -51
  111. package/lib/templates/content/data-privacy/privacy-policy.md +74 -57
  112. package/lib/templates/content/data-subject-privacy/data-access.md +59 -69
  113. package/lib/templates/content/data-subject-privacy/data-portability.md +67 -93
  114. package/lib/templates/content/data-subject-privacy/privacy-requests.md +66 -62
  115. package/lib/templates/content/file-uploads/file-upload-overview.md +69 -50
  116. package/lib/templates/content/getting-started/getting-started-guide.md +65 -51
  117. package/lib/templates/content/product-features/ai-models.md +62 -57
  118. package/lib/templates/content/product-features/collaboration-tools.md +66 -45
  119. package/lib/templates/content/product-features/conversation-features.md +59 -34
  120. package/lib/templates/content/product-features/export-features.md +65 -45
  121. package/lib/templates/content/product-features/follow-up-questions.md +61 -46
  122. package/lib/templates/content/product-features/real-time-search.md +63 -46
  123. package/lib/templates/content/product-features/saved-searches.md +53 -45
  124. package/lib/templates/content/product-features/search-features.md +61 -43
  125. package/lib/templates/content/product-features/search-history.md +64 -45
  126. package/lib/templates/content/product-features/source-citations.md +77 -44
  127. package/lib/templates/content/scope-api/api-overview.md +83 -46
  128. package/lib/templates/content/search-modes/deep-research.md +87 -46
  129. package/lib/templates/content/search-modes/labs-features.md +54 -56
  130. package/lib/templates/content/search-modes/pro-search.md +53 -45
  131. package/lib/templates/content/search-modes/regular-search.md +64 -46
  132. package/lib/templates/content/spaces-library/spaces-overview.md +68 -47
  133. package/lib/templates/content/student-hub/academic-research.md +85 -48
  134. package/lib/templates/content/student-hub/student-discounts.md +51 -57
  135. package/lib/templates/content/student-hub/student-overview.md +64 -45
  136. package/lib/templates/content/student-hub/study-tools.md +99 -45
  137. package/lib/templates/content/subscription-billing/billing-cycle.md +50 -48
  138. package/lib/templates/content/subscription-billing/billing-overview.md +55 -34
  139. package/lib/templates/content/subscription-billing/billing-support.md +52 -45
  140. package/lib/templates/content/subscription-billing/currency-support.md +36 -62
  141. package/lib/templates/content/subscription-billing/enterprise-pricing.md +52 -46
  142. package/lib/templates/content/subscription-billing/invoice-management.md +64 -45
  143. package/lib/templates/content/subscription-billing/payment-methods.md +53 -52
  144. package/lib/templates/content/subscription-billing/promotional-offers.md +58 -46
  145. package/lib/templates/content/subscription-billing/refund-policy.md +53 -58
  146. package/lib/templates/content/subscription-billing/student-discounts.md +61 -57
  147. package/lib/templates/content/subscription-billing/tax-information.md +64 -45
  148. package/lib/templates/content/technical-questions/ai-models-technical.md +60 -56
  149. package/lib/templates/content/technical-questions/api-technical.md +131 -56
  150. package/lib/templates/content/technical-questions/data-processing.md +74 -55
  151. package/lib/templates/content/technical-questions/database-architecture.md +91 -54
  152. package/lib/templates/content/technical-questions/infrastructure.md +104 -55
  153. package/lib/templates/content/technical-questions/performance-optimization.md +77 -55
  154. package/lib/templates/content/technical-questions/search-algorithms.md +78 -56
  155. package/lib/templates/content/technical-questions/technical-overview.md +76 -52
  156. package/lib/templates/content/threads/conversation-management.md +81 -45
  157. package/lib/templates/content/threads/threads-overview.md +69 -45
  158. package/lib/templates/content/troubleshooting/common-issues.md +91 -43
  159. package/lib/templates/content/what-is-yantra/getting-started-yantra.md +58 -57
  160. package/package.json +2 -2
  161. package/lib/pages/LandingPage/components/Icons.d.ts +0 -13
  162. package/lib/pages/LandingPage/components/Icons.d.ts.map +0 -1
  163. package/lib/pages/LandingPage/components/Icons.js.map +0 -1
@@ -1,83 +1,102 @@
1
- # Data Processing
1
+ # Data Processing Pipeline
2
2
 
3
- How Yantra processes and analyzes data.
3
+ How Yantra ingests, transforms, analyzes, and stores your data — from raw input to search-ready content.
4
4
 
5
- ## Data Pipeline
5
+ ---
6
6
 
7
- ### Data Ingestion
7
+ ## Data Ingestion
8
8
 
9
- - **Web Crawling**: Automated web content crawling
10
- - **API Integration**: Third-party API data ingestion
11
- - **User Input**: User-generated content processing
12
- - **File Uploads**: Document and file processing
9
+ Yantra supports multiple ingestion pathways to accommodate diverse data sources:
13
10
 
14
- ### Data Processing
11
+ | Source | Method | Details |
12
+ | ------------ | ----------------------- | ----------------------------------------------------------------------- |
13
+ | Web content | Automated crawling | Configurable crawl schedules, depth limits, and domain allowlists |
14
+ | APIs | REST/GraphQL connectors | Pre-built connectors for 50+ services (Slack, Notion, Confluence, etc.) |
15
+ | File uploads | Drag-and-drop or API | Supports PDF, DOCX, PPTX, CSV, Markdown, HTML, and plain text |
16
+ | Databases | Direct connectors | PostgreSQL, MySQL, MongoDB read-only connectors with incremental sync |
15
17
 
16
- - **Content Extraction**: Text and content extraction
17
- - **Data Cleaning**: Data cleaning and normalization
18
- - **Data Enrichment**: Data enrichment and enhancement
19
- - **Data Validation**: Data quality validation
18
+ ### Ingestion guarantees
19
+
20
+ - **Exactly-once processing**: Deduplication ensures no content is indexed twice.
21
+ - **Incremental updates**: Only new or modified content is re-processed, minimizing compute costs.
22
+ - **Schema validation**: Incoming data is validated against expected schemas before entering the pipeline.
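The deduplication and incremental-update guarantees listed above can be illustrated with a content fingerprint per document. This is a minimal sketch, not Yantra's implementation: the `IngestionIndex` class, the document IDs, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def content_fingerprint(text: str) -> str:
    """Return a stable SHA-256 fingerprint of a document body."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class IngestionIndex:
    """Hypothetical in-memory record of what has already been indexed."""

    def __init__(self) -> None:
        self._fingerprints = {}  # doc_id -> fingerprint of the last indexed body

    def needs_processing(self, doc_id: str, text: str) -> bool:
        """Return True for new or modified content, False for an exact repeat."""
        fingerprint = content_fingerprint(text)
        if self._fingerprints.get(doc_id) == fingerprint:
            return False  # duplicate delivery: skip re-indexing
        self._fingerprints[doc_id] = fingerprint
        return True


index = IngestionIndex()
results = [
    index.needs_processing("doc-1", "hello"),     # new document
    index.needs_processing("doc-1", "hello"),     # duplicate delivery
    index.needs_processing("doc-1", "hello v2"),  # modified content
]
```

A production pipeline would keep the fingerprints in durable storage so the check survives restarts.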
23
+
24
+ ---
20
25
 
21
26
  ## Processing Architecture
22
27
 
23
- ### Stream Processing
28
+ ### Stream processing
29
+
30
+ Yantra uses an event-driven architecture built on **Apache Kafka** for real-time data flow:
31
+
32
+ 1. **Producers** publish raw content events to topic partitions.
33
+ 2. **Stream processors** (Kafka Streams) consume events, apply transformations, and emit enriched records.
34
+ 3. **Consumers** write processed data to the appropriate storage layer (PostgreSQL, Elasticsearch, Redis).
35
+
36
+ ### Batch processing
24
37
 
25
- - **Real-time Processing**: Real-time data processing
26
- - **Event Streaming**: Event-driven processing
27
- - **Message Queues**: Asynchronous processing
28
- - **Batch Processing**: Batch data processing
38
+ For large-volume imports or periodic re-indexing, Yantra runs batch jobs using distributed task queues:
29
39
 
30
- ### Data Transformation
40
+ - Jobs are broken into chunks of 1,000 records.
41
+ - Each chunk is processed in parallel across worker nodes.
42
+ - Progress is tracked in real time and visible in the Admin Dashboard.
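The chunk-and-parallelize pattern described in the batch list can be sketched as follows. Only the chunk size of 1,000 comes from the text; the thread pool, the worker count, and the placeholder `process_chunk` function are assumptions standing in for a distributed task queue.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

CHUNK_SIZE = 1_000  # chunk size stated in the documentation


def chunked(records, size=CHUNK_SIZE):
    """Yield successive lists of at most `size` records."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def process_chunk(chunk):
    """Placeholder work: report how many records the chunk held."""
    return len(chunk)


def run_batch(records, workers=4):
    """Process every chunk in parallel and return per-chunk results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_chunk, chunked(records)))


chunk_sizes = run_batch(range(2_500))
```

Here `pool.map` preserves input order, so per-chunk results can be matched back to their position in the job when reporting progress.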
31
43
 
32
- - **ETL Processes**: Extract, transform, load
33
- - **Data Mapping**: Data field mapping
34
- - **Format Conversion**: Data format conversion
35
- - **Schema Evolution**: Schema management
44
+ ---
36
45
 
37
46
  ## Content Analysis
38
47
 
39
- ### Text Processing
48
+ Every piece of content passes through a multi-stage analysis pipeline:
40
49
 
41
- - **Natural Language Processing**: NLP processing
42
- - **Sentiment Analysis**: Sentiment detection
43
- - **Topic Modeling**: Topic extraction
44
- - **Entity Recognition**: Named entity recognition
50
+ ### Text processing
45
51
 
46
- ### Content Classification
52
+ - **Language detection** — Automatically identifies 50+ languages.
53
+ - **Tokenization** — Text is split into meaningful tokens using language-specific tokenizers.
54
+ - **Normalization** — Unicode normalization, lowercasing, and stop-word removal.
47
55
 
48
- - **Content Categorization**: Automatic categorization
49
- - **Quality Assessment**: Content quality scoring
50
- - **Relevance Scoring**: Relevance assessment
51
- - **Duplicate Detection**: Duplicate content detection
56
+ ### Semantic analysis
52
57
 
53
- ## Data Storage
58
+ - **Embedding generation** — Each document is converted to a high-dimensional vector using state-of-the-art embedding models.
59
+ - **Topic modeling** — Latent topics are extracted using LDA and transformer-based approaches.
60
+ - **Entity recognition** — People, organizations, dates, locations, and custom entities are identified and tagged.
61
+ - **Sentiment analysis** — Content sentiment (positive, negative, neutral) is scored for analytics.
62
+
63
+ ### Quality scoring
64
+
65
+ | Signal | Weight | Description |
66
+ | ---------------- | ------ | ------------------------------------------------ |
67
+ | Completeness | 25% | Does the content cover its topic thoroughly? |
68
+ | Freshness | 20% | How recently was the content created or updated? |
69
+ | Source authority | 30% | How trustworthy is the source domain? |
70
+ | Readability | 15% | Flesch-Kincaid score and structural quality |
71
+ | Uniqueness | 10% | Duplicate and near-duplicate detection |
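The table gives the weights but not the formula that combines them. A minimal sketch, assuming each signal is normalized to the range 0 to 1 and the overall score is a plain weighted sum (both are assumptions, as are the signal key names):

```python
QUALITY_WEIGHTS = {
    "completeness": 0.25,
    "freshness": 0.20,
    "source_authority": 0.30,
    "readability": 0.15,
    "uniqueness": 0.10,
}


def quality_score(signals):
    """Combine per-signal scores in [0, 1] into one weighted score in [0, 1]."""
    missing = QUALITY_WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    for name in QUALITY_WEIGHTS:
        if not 0.0 <= signals[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return sum(weight * signals[name] for name, weight in QUALITY_WEIGHTS.items())


example_score = quality_score(
    {
        "completeness": 0.8,
        "freshness": 0.5,
        "source_authority": 0.9,
        "readability": 0.6,
        "uniqueness": 1.0,
    }
)
# 0.25*0.8 + 0.20*0.5 + 0.30*0.9 + 0.15*0.6 + 0.10*1.0 = 0.76 (up to float rounding)
```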
54
72
 
55
- ### Storage Systems
73
+ ---
74
+
75
+ ## Data Storage
56
76
 
57
- - **Primary Storage**: PostgreSQL database
58
- - **Search Storage**: Elasticsearch index
59
- - **Cache Storage**: Redis cache
60
- - **File Storage**: Object storage system
77
+ ### Multi-tier storage architecture
61
78
 
62
- ### Data Management
79
+ | Tier | Technology | Purpose | Retention |
80
+ | ---- | ---------------------------- | ------------------------------------------ | -------------------- |
81
+ | Hot | PostgreSQL + Redis | Active queries, user data, real-time cache | Indefinite |
82
+ | Warm | Elasticsearch | Full-text search index, vector search | Indefinite |
83
+ | Cold | S3-compatible object storage | Archived content, raw uploads, backups | Per retention policy |
63
84
 
64
- - **Data Lifecycle**: Data lifecycle management
65
- - **Data Retention**: Data retention policies
66
- - **Data Archival**: Data archival processes
67
- - **Data Deletion**: Secure data deletion
85
+ ### Data lifecycle
68
86
 
69
- ## Performance Optimization
87
+ 1. **Ingest** — Raw data lands in the processing queue.
88
+ 2. **Process** — Content is analyzed, enriched, and indexed.
89
+ 3. **Serve** — Processed data is available for search and retrieval.
90
+ 4. **Archive** — Older content is compressed and moved to cold storage based on access patterns.
91
+ 5. **Delete** — Data past its retention period is securely purged (overwritten + cryptographic erasure).
70
92
 
71
- ### Processing Performance
93
+ ---
72
94
 
73
- - **Parallel Processing**: Multi-threaded processing
74
- - **Distributed Processing**: Distributed computing
75
- - **Caching**: Intelligent caching strategies
76
- - **Optimization**: Performance optimization
95
+ ## Performance & Scalability
77
96
 
78
- ### Scalability
97
+ - **Throughput** — The pipeline processes 10,000+ documents per minute at steady state.
98
+ - **Latency** — End-to-end ingestion-to-searchable time is under 30 seconds for real-time sources.
99
+ - **Horizontal scaling** — Worker nodes auto-scale based on queue depth. During peak loads, the system scales from 4 to 32 workers automatically.
100
+ - **Backpressure handling** — If downstream systems slow down, the pipeline applies backpressure to producers rather than dropping data.
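Backpressure, as described in the last bullet, means a fast producer is slowed down instead of having its events dropped. The sketch below uses a bounded in-memory queue as a stand-in; the function name, queue size, and simulated consumer delay are illustrative assumptions and say nothing about how the actual pipeline is wired.

```python
import queue
import threading
import time


def run_pipeline(total_events, max_pending):
    """Push events through a bounded queue; return (events consumed, peak queue depth)."""
    pending = queue.Queue(maxsize=max_pending)
    consumed = []

    def consumer():
        while True:
            event = pending.get()
            if event is None:  # sentinel: producer is done
                return
            time.sleep(0.001)  # simulate a slow downstream system
            consumed.append(event)

    worker = threading.Thread(target=consumer)
    worker.start()

    peak_depth = 0
    for event in range(total_events):
        pending.put(event)  # blocks while the queue is full: this is the backpressure
        peak_depth = max(peak_depth, pending.qsize())
    pending.put(None)
    worker.join()
    return len(consumed), peak_depth


processed, peak = run_pipeline(total_events=50, max_pending=5)
```

Every event is eventually consumed, and the queue depth never exceeds `max_pending`, because `put` blocks the producer rather than discarding data.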
79
101
 
80
- - **Horizontal Scaling**: Scale-out architecture
81
- - **Load Distribution**: Load balancing
82
- - **Resource Management**: Resource optimization
83
- - **Auto-scaling**: Automatic scaling
102
+ > **Enterprise customers** can configure custom processing rules, retention policies, and data routing via the Admin Dashboard.
@@ -1,83 +1,120 @@
1
1
  # Database Architecture
2
2
 
3
- Technical details about data storage and retrieval.
3
+ How Yantra stores, indexes, and retrieves data across multiple database systems optimized for different access patterns.
4
4
 
5
- ## Database Design
5
+ ---
6
6
 
7
- ### Primary Database
7
+ ## Database Design Philosophy
8
8
 
9
- - **PostgreSQL**: Primary relational database
10
- - **Schema Design**: Optimized database schema
11
- - **Indexing Strategy**: Comprehensive indexing
12
- - **Partitioning**: Database partitioning
9
+ Yantra follows a **polyglot persistence** strategy — each data type is stored in the database engine best suited for its access pattern:
13
10
 
14
- ### Data Models
11
+ | Data type | Engine | Why |
12
+ | ------------------------------ | --------------------- | ------------------------------------------------------- |
13
+ | User accounts, billing, config | PostgreSQL 16 | ACID transactions, relational integrity, mature tooling |
14
+ | Full-text + vector search | Elasticsearch 8.x | Inverted indexes, BM25 ranking, kNN vector search |
15
+ | Sessions, cache, rate limits | Redis 7 | Sub-millisecond reads, TTL support, pub/sub |
16
+ | File uploads, backups | S3-compatible storage | Virtually unlimited capacity, 11 nines (99.999999999%) durability |
15
17
 
16
- - **User Data**: User account and profile data
17
- - **Content Data**: Search content and metadata
18
- - **Usage Data**: User interaction and analytics data
19
- - **System Data**: System configuration and logs
18
+ ---
20
19
 
21
- ## Storage Systems
20
+ ## PostgreSQL — Primary Database
22
21
 
23
- ### Multi-tier Storage
22
+ ### Schema design principles
24
23
 
25
- - **Hot Storage**: Frequently accessed data
26
- - **Warm Storage**: Moderately accessed data
27
- - **Cold Storage**: Archive and backup data
28
- - **Cache Layer**: High-speed cache storage
24
+ - **Normalized core tables**: Users, organizations, subscriptions, and permissions follow 3NF to avoid data anomalies.
25
+ - **JSONB for flexibility**: Metadata, user preferences, and integration configs use JSONB columns, combining schema flexibility with indexing support.
26
+ - **Timestamped everything**: Every table includes `created_at` and `updated_at` columns with timezone-aware timestamps.
27
+ - **Soft deletes**: Records are marked as deleted rather than physically removed, enabling audit trails and data recovery.
29
28
 
30
- ### Data Distribution
29
+ ### Indexing strategy
31
30
 
32
- - **Sharding**: Horizontal data partitioning
33
- - **Replication**: Data replication for availability
34
- - **Backup**: Automated backup systems
35
- - **Recovery**: Disaster recovery procedures
31
+ | Index type | Use case | Example |
32
+ | ---------- | -------------------------- | -------------------------------------------------- |
33
+ | B-tree | Equality and range queries | `WHERE created_at > '2026-01-01'` |
34
+ | GIN | JSONB containment queries | `WHERE metadata @> '{"type": "pdf"}'` |
35
+ | Partial | Hot data subsets | `WHERE status = 'active'` (index only active rows) |
36
+ | Covering | Avoid table lookups | Include all `SELECT` columns in the index |
36
37
 
37
- ## Query Optimization
38
+ ### High availability
38
39
 
39
- ### Performance Tuning
40
+ - **Streaming replication** — One synchronous standby + two async replicas.
41
+ - **Automatic failover** — Patroni manages leader election; failover completes in < 10 seconds.
42
+ - **Point-in-time recovery** — WAL archiving enables recovery to any second within the retention window (30 days).
40
43
 
41
- - **Query Analysis**: Query performance analysis
42
- - **Index Optimization**: Database index optimization
43
- - **Execution Plans**: Query execution optimization
44
- - **Connection Pooling**: Database connection pooling
44
+ ---
45
45
 
46
- ### Caching Strategy
46
+ ## Elasticsearch — Search Index
47
47
 
48
- - **Query Caching**: Database query caching
49
- - **Result Caching**: Application-level caching
50
- - **CDN Caching**: Content delivery network caching
51
- - **Distributed Caching**: Distributed cache systems
48
+ ### Index architecture
49
+
50
+ Each content type has its own Elasticsearch index with optimized mappings:
51
+
52
+ - **Text fields** use `text` type with custom analyzers (language-specific stemming, synonym expansion).
53
+ - **Vector fields** use `dense_vector` type for kNN semantic search.
54
+ - **Keyword fields** for exact-match filtering (tags, content type, source).
55
+ - **Date fields** for time-range queries and recency boosting.
56
+
57
+ ### Cluster topology
58
+
59
+ - **3 dedicated master nodes** for cluster coordination.
60
+ - **6+ data nodes** with SSD storage for search performance.
61
+ - **2 coordinating nodes** for query routing and result aggregation.
62
+
63
+ ---
64
+
65
+ ## Redis — Cache & Real-Time
66
+
67
+ ### Cache patterns
68
+
69
+ | Pattern | Use case | TTL |
70
+ | ------------------- | ---------------------------- | ----------------------- |
71
+ | Query result cache | Identical search queries | 5 minutes |
72
+ | Session store | User authentication sessions | 1 hour |
73
+ | Rate limit counters | API rate limiting | Rolling 1-minute window |
74
+ | Pub/Sub channels | Real-time notifications | N/A (ephemeral) |
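The "rolling 1-minute window" used for rate limit counters can be sketched in plain Python. This is an in-process illustration only: the class name, the per-client limit, and the explicit `now` parameter are assumptions, and a Redis-backed limiter would hold the timestamps in Redis instead of a local dictionary.

```python
from collections import deque


class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per client within a rolling time window."""

    def __init__(self, limit, window_seconds=60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits = {}  # client_id -> deque of accepted request timestamps

    def allow(self, client_id, now):
        hits = self._hits.setdefault(client_id, deque())
        # Drop timestamps that have fallen out of the rolling window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowRateLimiter(limit=3)
decisions = [limiter.allow("client-a", t) for t in (0, 10, 20, 30, 61)]
# The fourth request exceeds the limit; by t=61 the t=0 request has expired.
```

Passing `now` explicitly keeps the example deterministic; real code would typically read a monotonic clock.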
75
+
76
+ ### Memory management
77
+
78
+ - **Maxmemory policy** set to `allkeys-lru` — least recently used keys are evicted when memory limits are reached.
79
+ - **Key namespacing** — All keys are prefixed by service name to avoid collisions.
80
+ - **Cluster mode** — Redis Cluster with 6 nodes (3 primary + 3 replica) for horizontal scaling.
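To make the `allkeys-lru` policy concrete, here is an exact least-recently-used cache in plain Python. It is a conceptual sketch only: Redis itself approximates LRU by sampling keys instead of tracking exact order, and the `LRUCache` class and the `search:` key prefix are illustrative.

```python
from collections import OrderedDict


class LRUCache:
    """Evict the least recently used key once `max_keys` is exceeded."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # reading a key marks it as recently used
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_keys:
            self._data.popitem(last=False)  # drop the least recently used key

    def keys(self):
        return list(self._data)


cache = LRUCache(max_keys=2)
cache.set("search:a", 1)
cache.set("search:b", 2)
cache.get("search:a")     # "search:a" is now the most recently used key
cache.set("search:c", 3)  # evicts "search:b"
remaining = cache.keys()
```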
81
+
82
+ ---
52
83
 
53
84
  ## Data Management
54
85
 
55
- ### Data Lifecycle
86
+ ### Backup strategy
87
+
88
+ | What | Frequency | Retention | Method |
89
+ | ------------- | --------------------------- | ---------- | ------------------------------- |
90
+ | PostgreSQL | Continuous WAL + daily full | 30 days | pg_basebackup + WAL archiving |
91
+ | Elasticsearch | Daily snapshots | 14 days | Snapshot to S3 |
92
+ | Redis | RDB snapshots + AOF | 7 days | Automated via Redis persistence |
93
+ | File storage | Cross-region replication | Indefinite | S3 cross-region replication |
56
94
 
57
- - **Data Ingestion**: Data collection and ingestion
58
- - **Data Processing**: Data transformation and processing
59
- - **Data Storage**: Data storage and organization
60
- - **Data Archival**: Data archival and cleanup
95
+ ### Data governance
61
96
 
62
- ### Data Quality
97
+ - **Encryption at rest** — AES-256 for all database storage volumes.
98
+ - **Encryption in transit** — TLS 1.3 for all database connections.
99
+ - **Access control** — Database credentials are rotated monthly via HashiCorp Vault.
100
+ - **Audit logging** — All schema changes and administrative queries are logged.
63
101
 
64
- - **Data Validation**: Data quality validation
65
- - **Data Cleaning**: Data cleaning and normalization
66
- - **Data Monitoring**: Data quality monitoring
67
- - **Data Governance**: Data governance policies
102
+ ---
68
103
 
69
104
  ## Scalability
70
105
 
71
- ### Horizontal Scaling
106
+ ### Current capacity
72
107
 
73
- - **Database Sharding**: Horizontal database scaling
74
- - **Read Replicas**: Read-only database replicas
75
- - **Load Distribution**: Database load distribution
76
- - **Auto-scaling**: Automatic database scaling
108
+ | Metric | Value |
109
+ | --------------------------------- | ------ |
110
+ | Total indexed documents | 500M+ |
111
+ | Database size (PostgreSQL) | 2.4 TB |
112
+ | Search index size (Elasticsearch) | 8.7 TB |
113
+ | Peak queries per second | 12,000 |
114
+ | Average query latency | 45 ms |
77
115
 
78
- ### Performance Monitoring
116
+ ### Scaling strategy
79
117
 
80
- - **Database Metrics**: Database performance metrics
81
- - **Query Monitoring**: Query performance monitoring
82
- - **Resource Monitoring**: Database resource monitoring
83
- - **Alerting**: Database performance alerting
118
+ - **Vertical**: Increase instance sizes for immediate capacity (database-level).
119
+ - **Horizontal**: Add read replicas (PostgreSQL), data nodes (Elasticsearch), or shard nodes (Redis) for linear scaling.
120
+ - **Partitioning**: Time-based partitioning for PostgreSQL tables with high write volume.
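Time-based partitioning routes each row to a partition chosen by its timestamp. The text does not state the partition granularity, so the sketch below assumes monthly partitions and a hypothetical `events` table; it only computes the partition name and its half-open date range.

```python
from datetime import date


def monthly_partition(table, day):
    """Return (partition name, inclusive start, exclusive end) for the month containing `day`."""
    start = day.replace(day=1)
    if start.month == 12:
        end = date(start.year + 1, 1, 1)
    else:
        end = date(start.year, start.month + 1, 1)
    return f"{table}_{start:%Y_%m}", start, end


partition = monthly_partition("events", date(2026, 12, 15))
# ("events_2026_12", date(2026, 12, 1), date(2027, 1, 1))
```

The half-open range (start inclusive, end exclusive) matches how range partition bounds are usually declared, so adjacent months never overlap.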
@@ -1,83 +1,132 @@
1
- # Infrastructure
1
+ # Infrastructure & Reliability
2
2
 
3
- Yantra's technical infrastructure and scalability.
3
+ Yantra's cloud infrastructure is designed for high availability, global performance, and security at every layer.
4
4
 
5
- ## Cloud Infrastructure
5
+ ---
6
6
 
7
- ### Cloud Architecture
7
+ ## Cloud Architecture
8
8
 
9
- - **Multi-cloud**: Multi-cloud deployment strategy
10
- - **Microservices**: Microservices architecture
11
- - **Containerization**: Docker containerization
12
- - **Orchestration**: Kubernetes orchestration
9
+ ### Multi-region deployment
13
10
 
14
- ### Infrastructure Components
11
+ Yantra runs across **3 AWS regions** (US East, EU West, AP Southeast) with active-active configuration. User traffic is routed to the nearest region via latency-based DNS routing.
15
12
 
16
- - **Compute**: Scalable compute resources
17
- - **Storage**: Distributed storage systems
18
- - **Networking**: High-performance networking
19
- - **Security**: Infrastructure security
13
+ ### Container orchestration
20
14
 
21
- ## Scalability Design
15
+ All services run as Docker containers orchestrated by **Kubernetes (EKS)**:
22
16
 
23
- ### Horizontal Scaling
17
+ - **Namespaces** isolate production, staging, and development environments.
18
+ - **Resource quotas** prevent any single service from consuming excessive cluster resources.
19
+ - **Rolling deployments** ensure zero-downtime updates with automatic rollback on health check failures.
20
+ - **Horizontal Pod Autoscaler** adjusts replica counts based on CPU, memory, and custom metrics.
24
21
 
25
- - **Auto-scaling**: Automatic scaling based on demand
26
- - **Load Balancing**: Advanced load balancing
27
- - **Distributed Systems**: Distributed architecture
28
- - **Resource Pooling**: Resource pooling strategies
22
+ ### Service mesh
29
23
 
30
- ### Performance Optimization
24
+ An Istio-based service mesh provides:
31
25
 
32
- - **Caching**: Multi-layer caching
33
- - **CDN**: Content delivery network
34
- - **Database Optimization**: Database performance tuning
35
- - **Network Optimization**: Network performance optimization
26
+ - **Mutual TLS** between all services (zero-trust networking).
27
+ - **Traffic management**: Canary deployments, circuit breaking, retry policies.
28
+ - **Observability**: Distributed tracing with Jaeger, metrics with Prometheus.
29
+
30
+ ---
36
31
 
37
32
  ## High Availability
38
33
 
39
- ### Availability Design
34
+ ### Availability targets
35
+
36
+ | Component | Target SLA | Actual (trailing 12 months) |
37
+ | -------------- | ---------- | --------------------------- |
38
+ | API Gateway | 99.99% | 99.995% |
39
+ | Search Service | 99.95% | 99.98% |
40
+ | AI Service | 99.9% | 99.94% |
41
+ | Data Pipeline | 99.9% | 99.92% |
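The SLA targets in the table translate into a yearly downtime budget. The arithmetic below is a worked example assuming a 365-day year; the dictionary simply restates the target column.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year

SLA_TARGETS = {
    "API Gateway": 99.99,
    "Search Service": 99.95,
    "AI Service": 99.9,
    "Data Pipeline": 99.9,
}


def allowed_downtime_minutes(availability_percent):
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


downtime_budget = {
    component: round(allowed_downtime_minutes(target), 2)
    for component, target in SLA_TARGETS.items()
}
# Roughly 52.56 min/year at 99.99%, 262.8 at 99.95%, and 525.6 at 99.9%.
```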
42
+
43
+ ### Redundancy design
44
+
45
+ - **No single point of failure** — Every component has at least 2 replicas across different availability zones.
46
+ - **Database failover** — Automated failover with < 10-second recovery for PostgreSQL and Redis.
47
+ - **Cross-region replication** — Critical data is replicated across regions for disaster recovery.
48
+ - **Graceful degradation** — If the AI service is unavailable, search still returns results without AI-generated summaries.
49
+
50
+ ### Disaster recovery
51
+
52
+ | Metric | Target | Actual |
53
+ | ------------------------------ | --------- | ---------- |
54
+ | Recovery Time Objective (RTO) | < 4 hours | 2.1 hours |
55
+ | Recovery Point Objective (RPO) | < 1 hour | 15 minutes |
56
+ | DR test frequency | Quarterly | Monthly |
57
+
58
+ ---
59
+
60
+ ## Monitoring & Observability
61
+
62
+ ### The three pillars
40
63
 
41
- - **Redundancy**: System redundancy
42
- - **Failover**: Automatic failover
43
- - **Disaster Recovery**: Disaster recovery procedures
44
- - **Backup Systems**: Comprehensive backup systems
64
+ | Pillar | Tools | Details |
65
+ | ------- | -------------------------------- | ------------------------------------------------ |
66
+ | Metrics | Prometheus + Grafana | 2,000+ custom metrics, 15-second scrape interval |
67
+ | Logs | Fluentd + Elasticsearch + Kibana | Structured JSON logs, 30-day retention |
68
+ | Traces | Jaeger + OpenTelemetry | End-to-end request tracing across all services |
45
69
 
46
- ### Monitoring
70
+ ### Alerting
47
71
 
48
- - **Health Monitoring**: System health monitoring
49
- - **Performance Monitoring**: Performance metrics
50
- - **Alerting**: Automated alerting systems
51
- - **Logging**: Comprehensive logging
72
+ - **PagerDuty integration** for critical alerts (P1/P2) with automatic escalation.
73
+ - **Slack notifications** for warnings and informational alerts.
74
+ - **Anomaly detection** — ML-based alerting detects unusual patterns before they become incidents.
75
+ - **Runbooks**: Every alert links to a runbook with diagnosis steps and remediation procedures.
76
+
77
+ ---
52
78
 
53
79
  ## Security Infrastructure
54
80
 
55
- ### Security Measures
81
+ ### Network security
82
+
83
+ - **VPC isolation** — All services run in private subnets with no direct internet access.
84
+ - **WAF (Web Application Firewall)** — Protects against OWASP Top 10 threats.
85
+ - **DDoS protection** — AWS Shield Advanced with automatic traffic scrubbing.
86
+ - **Egress filtering** — Outbound traffic is restricted to known-good destinations.
87
+
88
+ ### Secrets management
89
+
90
+ - **HashiCorp Vault** for all secrets, API keys, and database credentials.
91
+ - **Automatic rotation** — Secrets are rotated on configurable schedules (default: 30 days).
92
+ - **Just-in-time access** — Engineers request temporary elevated access via an approval workflow.
93
+
94
+ ---
95
+
96
+ ## CI/CD Pipeline
97
+
98
+ ### Deployment flow
99
+
100
+ 1. **Code push** — Developer pushes to a feature branch on GitHub.
101
+ 2. **CI checks** — Automated linting, type checking, unit tests, and integration tests run in GitHub Actions.
102
+ 3. **Build** — Docker images are built, scanned for vulnerabilities, and pushed to ECR.
103
+ 4. **Staging deploy** — ArgoCD deploys to the staging environment automatically.
104
+ 5. **QA validation** — Automated end-to-end tests + manual spot checks.
105
+ 6. **Production deploy** — Canary deployment to 5% of traffic, then gradual rollout to 100%.
106
+ 7. **Post-deploy monitoring** — Automated health checks verify error rates and latency for 30 minutes.
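Steps 6 and 7 describe a canary rollout gated by health checks. A minimal sketch of that control loop follows; only the 5% starting point and the 100% end state come from the text, while the intermediate stages, the `run_canary` function, and the health-check callback are illustrative assumptions.

```python
# 5 and 100 come from the deployment flow above; 25 and 50 are illustrative.
ROLLOUT_STAGES = [5, 25, 50, 100]


def run_canary(is_healthy):
    """Step through the rollout stages; roll back to 0% on the first failed check.

    `is_healthy` is a callable that takes the current traffic percentage and
    returns True when error rates and latency look acceptable.
    """
    history = []
    for percent in ROLLOUT_STAGES:
        history.append(percent)
        if not is_healthy(percent):
            history.append(0)  # rollback: route all traffic to the previous version
            break
    return history


full_rollout = run_canary(lambda percent: True)
rolled_back = run_canary(lambda percent: percent < 50)
```

In this sketch a failed check at 50% ends the rollout with a `0` entry, which stands for the sub-60-second rollback mentioned under deployment frequency.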
56
107
 
57
- - **Network Security**: Network-level security
58
- - **Application Security**: Application security
59
- - **Data Security**: Data protection measures
60
- - **Access Control**: Access control systems
108
+ ### Deployment frequency
61
109
 
62
- ### Compliance
110
+ - **Production deploys** — 8-12 per day across all services.
111
+ - **Rollback time** — Under 60 seconds to previous version.
112
+ - **Feature flags** — LaunchDarkly for gradual feature rollouts and instant kill switches.
63
113
 
64
- - **Security Standards**: Industry security standards
65
- - **Compliance Monitoring**: Compliance monitoring
66
- - **Audit Logging**: Comprehensive audit logs
67
- - **Security Testing**: Regular security testing
114
+ ---
68
115
 
69
- ## Global Infrastructure
116
+ ## Global Performance
70
117
 
71
- ### Geographic Distribution
118
+ ### Content delivery
72
119
 
73
- - **Multi-region**: Multi-region deployment
74
- - **Edge Computing**: Edge computing capabilities
75
- - **Latency Optimization**: Low-latency optimization
76
- - **Data Residency**: Data residency compliance
120
+ - **200+ edge locations** via CloudFront CDN.
121
+ - **Edge caching** for static assets with 85%+ cache hit rate.
122
+ - **Dynamic content acceleration** — Optimized TCP connections and HTTP/3 support.
77
123
 
78
- ### Network Architecture
124
+ ### Latency optimization
79
125
 
80
- - **Global Network**: Global network infrastructure
81
- - **Peering**: Network peering agreements
82
- - **Traffic Management**: Intelligent traffic management
83
- - **Bandwidth**: High-bandwidth connectivity
126
+ | Region | Average API latency |
127
+ | -------------- | ------------------- |
128
+ | US East | 35 ms |
129
+ | US West | 52 ms |
130
+ | EU West | 41 ms |
131
+ | AP Southeast | 68 ms |
132
+ | Global average | 48 ms |