npm - @adminide-stack/yantra-help-browser - Versions diffs - 12.0.16-alpha.26 → 12.0.16-alpha.28 - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (156)

package/lib/templates/content/technical-questions/infrastructure.md CHANGED

@@ -1,83 +1,132 @@
-# Infrastructure
+# Infrastructure & Reliability
-Yantra's technical infrastructure and scalability.
+Yantra's cloud infrastructure is designed for high availability, global performance, and security at every layer.
-## Cloud Infrastructure
+---
-### Cloud Architecture
+## Cloud Architecture
-- **Multi-cloud**: Multi-cloud deployment strategy
-- **Microservices**: Microservices architecture
-- **Containerization**: Docker containerization
-- **Orchestration**: Kubernetes orchestration
+### Multi-region deployment
-### Infrastructure Components
+Yantra runs across **3 AWS regions** (US East, EU West, AP Southeast) with active-active configuration. User traffic is routed to the nearest region via latency-based DNS routing.
-- **Compute**: Scalable compute resources
-- **Storage**: Distributed storage systems
-- **Networking**: High-performance networking
-- **Security**: Infrastructure security
+### Container orchestration
-## Scalability Design
+All services run as Docker containers orchestrated by **Kubernetes (EKS)**:
-### Horizontal Scaling
+- **Namespaces** isolate production, staging, and development environments.
+- **Resource quotas** prevent any single service from consuming excessive cluster resources.
+- **Rolling deployments** ensure zero-downtime updates with automatic rollback on health check failures.
+- **Horizontal Pod Autoscaler** adjusts replica counts based on CPU, memory, and custom metrics.
-- **Auto-scaling**: Automatic scaling based on demand
-- **Load Balancing**: Advanced load balancing
-- **Distributed Systems**: Distributed architecture
-- **Resource Pooling**: Resource pooling strategies
+### Service mesh
-### Performance Optimization
+An Istio-based service mesh provides:
-- **Caching**: Multi-layer caching
-- **CDN**: Content delivery network
-- **Database Optimization**: Database performance tuning
-- **Network Optimization**: Network performance optimization
+- **Mutual TLS** between all services (zero-trust networking).
+- **Traffic management** — Canary deployments, circuit breaking, retry policies.
+- **Observability** — Distributed tracing with Jaeger, metrics with Prometheus.
+---
 ## High Availability
-### Availability Design
+### Availability targets
+| Component      | Target SLA | Actual (trailing 12 months) |
+| -------------- | ---------- | --------------------------- |
+| API Gateway    | 99.99%     | 99.995%                     |
+| Search Service | 99.95%     | 99.98%                      |
+| AI Service     | 99.9%      | 99.94%                      |
+| Data Pipeline  | 99.9%      | 99.92%                      |
+### Redundancy design
+- **No single point of failure** — Every component has at least 2 replicas across different availability zones.
+- **Database failover** — Automated failover with < 10-second recovery for PostgreSQL and Redis.
+- **Cross-region replication** — Critical data is replicated across regions for disaster recovery.
+- **Graceful degradation** — If the AI service is unavailable, search still returns results without AI-generated summaries.
+### Disaster recovery
+| Metric                         | Target    | Actual     |
+| ------------------------------ | --------- | ---------- |
+| Recovery Time Objective (RTO)  | < 4 hours | 2.1 hours  |
+| Recovery Point Objective (RPO) | < 1 hour  | 15 minutes |
+| DR test frequency              | Quarterly | Monthly    |
+---
+## Monitoring & Observability
+### The three pillars
-- **Redundancy**: System redundancy
-- **Failover**: Automatic failover
-- **Disaster Recovery**: Disaster recovery procedures
-- **Backup Systems**: Comprehensive backup systems
+| Pillar  | Tools                            | Details                                          |
+| ------- | -------------------------------- | ------------------------------------------------ |
+| Metrics | Prometheus + Grafana             | 2,000+ custom metrics, 15-second scrape interval |
+| Logs    | Fluentd + Elasticsearch + Kibana | Structured JSON logs, 30-day retention           |
+| Traces  | Jaeger + OpenTelemetry           | End-to-end request tracing across all services   |
-### Monitoring
+### Alerting
-- **Health Monitoring**: System health monitoring
-- **Performance Monitoring**: Performance metrics
-- **Alerting**: Automated alerting systems
-- **Logging**: Comprehensive logging
+- **PagerDuty integration** for critical alerts (P1/P2) with automatic escalation.
+- **Slack notifications** for warnings and informational alerts.
+- **Anomaly detection** — ML-based alerting detects unusual patterns before they become incidents.
+- **Runbooks** — Every alert links to a runbook with diagnosis steps and remediation procedures.
+---
 ## Security Infrastructure
-### Security Measures
+### Network security
+- **VPC isolation** — All services run in private subnets with no direct internet access.
+- **WAF (Web Application Firewall)** — Protects against OWASP Top 10 threats.
+- **DDoS protection** — AWS Shield Advanced with automatic traffic scrubbing.
+- **Egress filtering** — Outbound traffic is restricted to known-good destinations.
+### Secrets management
+- **HashiCorp Vault** for all secrets, API keys, and database credentials.
+- **Automatic rotation** — Secrets are rotated on configurable schedules (default: 30 days).
+- **Just-in-time access** — Engineers request temporary elevated access via an approval workflow.
+---
+## CI/CD Pipeline
+### Deployment flow
+1. **Code push** — Developer pushes to a feature branch on GitHub.
+2. **CI checks** — Automated linting, type checking, unit tests, and integration tests run in GitHub Actions.
+3. **Build** — Docker images are built, scanned for vulnerabilities, and pushed to ECR.
+4. **Staging deploy** — ArgoCD deploys to the staging environment automatically.
+5. **QA validation** — Automated end-to-end tests + manual spot checks.
+6. **Production deploy** — Canary deployment to 5% of traffic, then gradual rollout to 100%.
+7. **Post-deploy monitoring** — Automated health checks verify error rates and latency for 30 minutes.
-- **Network Security**: Network-level security
-- **Application Security**: Application security
-- **Data Security**: Data protection measures
-- **Access Control**: Access control systems
+### Deployment frequency
-### Compliance
+- **Production deploys** — 8-12 per day across all services.
+- **Rollback time** — Under 60 seconds to previous version.
+- **Feature flags** — LaunchDarkly for gradual feature rollouts and instant kill switches.
-- **Security Standards**: Industry security standards
-- **Compliance Monitoring**: Compliance monitoring
-- **Audit Logging**: Comprehensive audit logs
-- **Security Testing**: Regular security testing
+---
-## Global Infrastructure
+## Global Performance
-### Geographic Distribution
+### Content delivery
-- **Multi-region**: Multi-region deployment
-- **Edge Computing**: Edge computing capabilities
-- **Latency Optimization**: Low-latency optimization
-- **Data Residency**: Data residency compliance
+- **200+ edge locations** via CloudFront CDN.
+- **Edge caching** for static assets with 85%+ cache hit rate.
+- **Dynamic content acceleration** — Optimized TCP connections and HTTP/3 support.
-### Network Architecture
+### Latency optimization
-- **Global Network**: Global network infrastructure
-- **Peering**: Network peering agreements
-- **Traffic Management**: Intelligent traffic management
-- **Bandwidth**: High-bandwidth connectivity
+| Region         | Average API latency |
+| -------------- | ------------------- |
+| US East        | 35 ms               |
+| US West        | 52 ms               |
+| EU West        | 41 ms               |
+| AP Southeast   | 68 ms               |
+| Global average | 48 ms               |
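
Note on the availability targets added in this file: a percentage target can be easier to reason about as a yearly downtime budget. The sketch below is illustrative arithmetic only and is not part of the package. It assumes a 365-day year, and the function name `downtime_budget_minutes` is hypothetical.

```python
# Illustrative arithmetic only: converts the availability targets quoted in the
# diff into a yearly downtime budget. The 365-day year is an assumption.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year that a target still permits."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Target SLA values copied from the "Availability targets" table.
targets = {
    "API Gateway": 99.99,
    "Search Service": 99.95,
    "AI Service": 99.9,
    "Data Pipeline": 99.9,
}

for component, target in targets.items():
    print(f"{component}: {downtime_budget_minutes(target):.1f} min/year")
```

Under these assumptions, a 99.99% target leaves roughly 52.6 minutes of downtime per year, 99.95% about 4.4 hours, and 99.9% about 8.8 hours.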

package/lib/templates/content/technical-questions/performance-optimization.md CHANGED

@@ -1,83 +1,105 @@
 # Performance Optimization
-Technical optimizations for speed and efficiency.
+How Yantra achieves sub-second response times at scale — and how you can get the most out of the platform.
-## System Performance
+---
-### Response Time Optimization
+## System Performance Targets
-- **Query Optimization**: Database query optimization
-- **Caching Strategies**: Intelligent caching
-- **CDN Integration**: Content delivery network
-- **Load Balancing**: Advanced load balancing
+| Metric                    | Target         | Current        |
+| ------------------------- | -------------- | -------------- |
+| Search latency (P50)      | < 200 ms       | 145 ms         |
+| Search latency (P99)      | < 1 s          | 780 ms         |
+| AI response latency (P50) | < 800 ms       | 620 ms         |
+| API uptime                | 99.95%         | 99.98%         |
+| Throughput                | 10,000 req/min | 14,200 req/min |
-### Throughput Optimization
+---
-- **Parallel Processing**: Multi-threaded processing
-- **Async Operations**: Asynchronous operations
-- **Connection Pooling**: Database connection pooling
-- **Resource Management**: Efficient resource management
+## Caching Strategy
+Yantra uses a **multi-layer caching architecture** to minimize redundant computation:
+### Layer 1 — Edge cache (CDN)
+Static assets and frequently requested API responses are cached at 200+ edge locations worldwide. Cache hit rates typically exceed 85%.
+### Layer 2 — Application cache (Redis)
+- **Query result cache** — Identical queries return cached results within 5 ms.
+- **Session cache** — User sessions and preferences are stored in Redis for instant access.
+- **Embedding cache** — Pre-computed embeddings for popular documents avoid redundant inference.
+### Layer 3 — Database cache
+- **Connection pooling** — PgBouncer maintains a pool of warm database connections, eliminating connection overhead.
+- **Materialized views** — Complex aggregation queries use pre-computed materialized views refreshed every 5 minutes.
+- **Query plan caching** — PostgreSQL's prepared statements reuse optimized query plans.
+---
 ## Database Optimization
-### Query Performance
+### Indexing strategy
-- **Index Optimization**: Database index optimization
-- **Query Analysis**: Query performance analysis
-- **Execution Plans**: Query execution optimization
-- **Connection Optimization**: Database connection optimization
+- **B-tree indexes** on primary keys and frequently filtered columns.
+- **GIN indexes** on JSONB fields for flexible metadata queries.
+- **Partial indexes** on hot data subsets (e.g., active users, recent content).
+- **Covering indexes** that include all queried columns to avoid table lookups.
-### Storage Optimization
+### Query optimization
-- **Data Partitioning**: Database partitioning
-- **Compression**: Data compression techniques
-- **Archiving**: Data archiving strategies
-- **Cleanup**: Automated data cleanup
+Every query is analyzed for:
-## Application Optimization
+1. **Execution plan review** — EXPLAIN ANALYZE is run on all new queries during code review.
+2. **N+1 detection** — Automated detection of N+1 query patterns in development.
+3. **Slow query logging** — Queries exceeding 100 ms are logged and reviewed weekly.
+4. **Connection management** — Read replicas handle analytics queries to keep the primary database responsive.
-### Code Optimization
+---
-- **Algorithm Optimization**: Algorithm efficiency
-- **Memory Management**: Memory optimization
-- **CPU Optimization**: CPU usage optimization
-- **I/O Optimization**: Input/output optimization
+## Application-Level Optimization
-### Framework Optimization
+### Frontend performance
-- **Framework Tuning**: Framework configuration
-- **Middleware Optimization**: Middleware performance
-- **Library Optimization**: Third-party library optimization
-- **Dependency Management**: Dependency optimization
+- **Code splitting** — Routes are lazy-loaded so users only download the JavaScript they need.
+- **Image optimization** — All images are served in WebP/AVIF format with responsive srcsets.
+- **Prefetching** — The most likely next navigation target is prefetched in the background.
+- **Bundle analysis** — We maintain strict bundle size budgets (< 200 KB initial JS, < 50 KB CSS).
-## Infrastructure Optimization
+### Backend performance
-### Server Optimization
+- **Async I/O** — All I/O-bound operations (database queries, API calls, file reads) are non-blocking.
+- **Worker pools** — CPU-intensive tasks (embedding generation, document parsing) run in dedicated worker processes.
+- **Streaming responses** — AI-generated responses stream token-by-token for perceived instant feedback.
+- **Graceful degradation** — If a dependency is slow, the system returns partial results rather than timing out entirely.
-- **Hardware Optimization**: Hardware configuration
-- **OS Tuning**: Operating system tuning
-- **Network Optimization**: Network performance
-- **Storage Optimization**: Storage performance
+---
+## Infrastructure Optimization
-### Cloud Optimization
+### Auto-scaling
-- **Resource Allocation**: Optimal resource allocation
-- **Auto-scaling**: Intelligent auto-scaling
-- **Cost Optimization**: Cloud cost optimization
-- **Performance Monitoring**: Performance monitoring
+| Component              | Scaling metric                | Scale range        |
+| ---------------------- | ----------------------------- | ------------------ |
+| API servers            | Request rate + latency        | 4 to 64 pods       |
+| AI workers             | GPU utilization + queue depth | 2 to 16 nodes      |
+| Kafka consumers        | Consumer lag                  | 3 to 24 partitions |
+| Database read replicas | Connection count              | 1 to 4 replicas    |
-## Monitoring and Profiling
+### Cost optimization
-### Performance Monitoring
+- **Spot instances** for batch processing workloads (60% cost savings).
+- **Reserved instances** for always-on services (40% cost savings).
+- **Right-sizing** — Monthly reviews of resource utilization with automated recommendations.
+- **Tiered storage** — Infrequently accessed data is automatically moved to cheaper storage tiers.
-- **Real-time Monitoring**: Real-time performance monitoring
-- **Metrics Collection**: Performance metrics collection
-- **Alerting**: Performance alerting
-- **Dashboards**: Performance dashboards
+---
-### Profiling Tools
+## Tips for Optimizing Your Experience
-- **Application Profiling**: Application performance profiling
-- **Database Profiling**: Database performance profiling
-- **Network Profiling**: Network performance profiling
-- **System Profiling**: System performance profiling
+1. **Use specific queries** — More precise queries return faster, more relevant results.
+2. **Apply filters** — Narrowing by date, source, or content type reduces the search space.
+3. **Enable caching** — For API integrations, respect cache headers to avoid unnecessary requests.
+4. **Use batch endpoints** — Bundle multiple operations into a single batch request.
+5. **Monitor your usage** — The Dashboard shows your API usage patterns and suggests optimizations.
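
Note on the query result cache described in this file: the general pattern is a read-through cache with expiry. The sketch below is a minimal in-memory stand-in, not Yantra's implementation. The names `TTLCache` and `cached_search`, the injectable clock, and the TTL parameter are all assumptions made for illustration; the diff says the real application cache is Redis.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a query result cache (hypothetical, not Yantra's code)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, key: str) -> Optional[List[str]]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: List[str]) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def cached_search(query: str, cache: TTLCache,
                  backend: Callable[[str], List[str]]) -> List[str]:
    """Read-through lookup: serve a hit from the cache, otherwise ask the backend."""
    hit = cache.get(query)
    if hit is not None:
        return hit
    result = backend(query)
    cache.set(query, result)
    return result
```

With this shape, a repeated identical query within the TTL never reaches the backend, which is the effect the "Query result cache" bullet describes.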

package/lib/templates/content/technical-questions/search-algorithms.md CHANGED

@@ -1,83 +1,105 @@
-# Search Algorithms
+# Search Algorithms & Ranking
-Technical details about Yantra's search algorithms.
+How Yantra finds the most relevant results in milliseconds — the algorithms, ranking signals, and optimizations behind the search experience.
+---
 ## Search Architecture
-### Core Algorithms
+Yantra's search system combines **three complementary approaches** to deliver highly relevant results:
-- **Vector Search**: Semantic vector search
-- **Keyword Search**: Traditional keyword matching
-- **Hybrid Search**: Combined vector and keyword search
-- **Ranking Algorithms**: Advanced ranking algorithms
+### 1. Keyword search (BM25)
-### Search Pipeline
+Traditional lexical matching using the BM25 algorithm. This excels at exact-match queries where the user knows specific terms.
-- **Query Processing**: Query analysis and processing
-- **Index Lookup**: Search index lookup
-- **Result Ranking**: Result ranking and scoring
-- **Result Filtering**: Result filtering and refinement
+- **How it works** — Documents are scored based on term frequency (TF), inverse document frequency (IDF), and document length normalization.
+- **Best for** — Error codes, product names, exact phrases, technical identifiers.
-## Ranking Algorithms
+### 2. Semantic search (vector similarity)
-### Relevance Scoring
+Neural embedding-based search that understands meaning beyond exact words.
-- **TF-IDF**: Term frequency-inverse document frequency
-- **BM25**: Best Matching 25 algorithm
-- **Semantic Similarity**: Semantic similarity scoring
-- **User Behavior**: User behavior-based ranking
+- **How it works** — Both the query and documents are converted to high-dimensional vectors. Results are ranked by cosine similarity.
+- **Best for** — Natural language questions, conceptual queries, "how do I..." questions.
-### Ranking Factors
+### 3. Hybrid search (default)
-- **Content Quality**: Content quality assessment
-- **Source Authority**: Source authority scoring
-- **Recency**: Content recency scoring
-- **User Preferences**: Personalized ranking
+Combines keyword and semantic results using **Reciprocal Rank Fusion (RRF)**, producing results that are both lexically precise and semantically relevant.
-## Indexing System
+```
+RRF_score = sum(1 / (k + rank_i)) for each ranking system i
+```
-### Search Index
+This is Yantra's default search mode and performs best for most queries.
-- **Inverted Index**: Traditional inverted index
-- **Vector Index**: Semantic vector index
-- **Metadata Index**: Metadata indexing
-- **Full-text Index**: Full-text search index
+---
-### Index Management
+## Ranking Signals
-- **Index Updates**: Real-time index updates
-- **Index Optimization**: Index optimization
-- **Index Partitioning**: Index partitioning
-- **Index Backup**: Index backup and recovery
+After initial retrieval, results are re-ranked using a learned ranking model that weighs multiple signals:
-## Query Processing
+| Signal            | Weight | Description                                          |
+| ----------------- | ------ | ---------------------------------------------------- |
+| Text relevance    | 35%    | BM25 + semantic similarity combined score            |
+| Source authority  | 20%    | Domain reputation, citation count, editorial quality |
+| Content freshness | 15%    | Recency of creation or last update                   |
+| User engagement   | 15%    | Click-through rate, time-on-page, bookmarks          |
+| Personalization   | 10%    | User's search history, preferred sources, role       |
+| Content quality   | 5%     | Readability score, completeness, structure           |
+### Boosting and suppression
-### Query Analysis
+- **Recency boost** — Content published in the last 30 days receives a configurable relevance boost.
+- **Source pinning** — Admins can pin trusted sources to always appear in the top results.
+- **Duplicate suppression** — Near-duplicate documents (> 85% similarity) are collapsed, showing only the highest-quality version.
+---
+## Indexing System
-- **Query Parsing**: Natural language query parsing
-- **Query Expansion**: Query expansion techniques
-- **Query Optimization**: Query optimization
-- **Query Caching**: Query result caching
+### How content gets indexed
+1. **Document intake** — New or updated content enters the indexing queue via the data pipeline.
+2. **Analysis** — The document is processed through language-specific analyzers (tokenization, stemming, synonym expansion).
+3. **Embedding** — A vector representation is generated using the configured embedding model.
+4. **Storage** — The analyzed text, metadata, and vector are written to Elasticsearch shards.
+5. **Availability** — The document becomes searchable within 1-2 seconds (near real-time).
+### Index management
+- **Shard allocation** — Each index is distributed across multiple shards for parallel query execution.
+- **Replica configuration** — Each shard has at least one replica for fault tolerance and read throughput.
+- **Index lifecycle** — Older indexes are automatically merged, force-merged, or moved to warm storage based on access patterns.
+---
+## Query Processing
-### Intent Recognition
+### What happens when you search
-- **Intent Classification**: User intent classification
-- **Entity Recognition**: Named entity recognition
-- **Context Understanding**: Context analysis
-- **Query Disambiguation**: Query disambiguation
+1. **Query parsing** — The raw input is tokenized and analyzed using the same pipeline as indexed documents.
+2. **Query expansion** — Synonyms and related terms are added automatically (e.g., "k8s" expands to "kubernetes").
+3. **Spell correction** — Common typos are detected and corrected using edit-distance algorithms.
+4. **Intent detection** — The system classifies whether the user wants a factual answer, a list, a comparison, or a how-to guide.
+5. **Execution** — The query runs against both keyword and vector indexes in parallel.
+6. **Fusion + re-ranking** — Results are merged via RRF, then re-ranked by the learned ranking model.
+7. **Response assembly** — Top results are formatted with highlighted snippets and source metadata.
-## Performance Optimization
+---
-### Search Performance
+## Performance
-- **Response Time**: Sub-second response times
-- **Throughput**: High query throughput
-- **Scalability**: Horizontal scaling
-- **Caching**: Multi-layer caching
+| Metric                     | Value                         |
+| -------------------------- | ----------------------------- |
+| Average search latency     | 145 ms                        |
+| P99 search latency         | < 780 ms                      |
+| Index refresh interval     | 1 second                      |
+| Maximum concurrent queries | 12,000/sec                    |
+| Index size                 | 500M+ documents across 8.7 TB |
-### Optimization Techniques
+### Optimization techniques
-- **Index Optimization**: Search index optimization
-- **Query Optimization**: Query performance optimization
-- **Caching Strategies**: Intelligent caching
-- **Load Balancing**: Search load balancing
+- **Query caching** — Identical queries are served from Redis cache (5-minute TTL).
+- **Shard routing** — Queries are routed to the shard most likely to contain relevant results.
+- **Early termination** — If the top results are already confident, remaining shards stop processing.
+- **Parallel execution** — Keyword and vector searches run concurrently, not sequentially.
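
Note on the Reciprocal Rank Fusion formula added in this file: the sketch below shows how `RRF_score = sum(1 / (k + rank_i))` can merge a keyword ranking and a semantic ranking. It is not taken from the package. The function name, the document IDs, and the choice of `k = 60` (a commonly used constant; the diff does not state a value) are assumptions, and ranks are taken to be 1-based.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one list, best first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Each ranking system contributes 1 / (k + rank) for every document it returns.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the keyword (BM25) and vector searches.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents returned by both systems (`doc_a`, `doc_b`) rise above those found by only one, which is why fusion tends to favor results that are both lexically and semantically relevant.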