@xdev-asia/xdev-knowledge-mcp 1.0.41 → 1.0.42
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/01-domain-1-fundamentals-ai-ml/lessons/01-bai-1-ai-ml-deep-learning-concepts.md +287 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/01-domain-1-fundamentals-ai-ml/lessons/02-bai-2-ml-lifecycle-aws-services.md +258 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/02-domain-2-fundamentals-generative-ai/lessons/03-bai-3-generative-ai-foundation-models.md +218 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/02-domain-2-fundamentals-generative-ai/lessons/04-bai-4-llm-transformers-multimodal.md +232 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/03-domain-3-applications-foundation-models/lessons/05-bai-5-prompt-engineering-techniques.md +254 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/03-domain-3-applications-foundation-models/lessons/06-bai-6-rag-vector-databases-knowledge-bases.md +244 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/03-domain-3-applications-foundation-models/lessons/07-bai-7-fine-tuning-model-customization.md +247 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/03-domain-3-applications-foundation-models/lessons/08-bai-8-amazon-bedrock-deep-dive.md +276 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/04-domain-4-responsible-ai/lessons/09-bai-9-responsible-ai-fairness-bias-transparency.md +224 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/04-domain-4-responsible-ai/lessons/10-bai-10-aws-responsible-ai-tools.md +252 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/05-domain-5-security-compliance/lessons/11-bai-11-ai-security-data-privacy-compliance.md +279 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/05-domain-5-security-compliance/lessons/12-bai-12-exam-strategy-cheat-sheet.md +229 -0
- package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/index.md +257 -0
- package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/index.md +240 -0
- package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/index.md +225 -0
- package/data/categories.json +16 -4
- package/data/quizzes.json +764 -0
- package/package.json +1 -1
@@ -0,0 +1,247 @@
---
id: 019c9619-lt01-d3-l07
title: 'Bài 7: Fine-tuning & Model Customization'
slug: bai-7-fine-tuning-model-customization
description: >-
  Pre-training vs Fine-tuning vs RLHF. PEFT & LoRA.
  Continued Pre-training. Amazon Bedrock Custom Models.
  Training data preparation, evaluation, deployment.
duration_minutes: 50
is_free: true
video_url: null
sort_order: 3
section_title: "Domain 3: Applications of Foundation Models (28%)"
course:
  id: 019c9619-lt01-7001-c001-lt0100000001
  title: 'Luyện thi AWS Certified AI Practitioner (AIF-C01)'
  slug: luyen-thi-aws-ai-practitioner
---

<div style="text-align: center; margin: 2rem 0;">
  <img src="/storage/uploads/2026/04/aws-aif-bai7-finetuning-spectrum.png" alt="Model Customization Spectrum" style="max-width: 800px; width: 100%; border-radius: 12px;" />
  <p><em>Model Customization Spectrum: from Prompt Engineering to Pre-training from scratch</em></p>
</div>

<h2 id="customization-spectrum"><strong>1. Model Customization Spectrum</strong></h2>

<p>There are several ways to customize FM behavior, from simple to complex:</p>

<pre><code class="language-text">Least Effort                                       Most Effort
──────────────────────────────────────────────────────────────
Prompt       Few-shot    RAG    Fine-    Continued     Pre-
Engineering  Prompting          tuning   Pre-training  training
──────────────────────────────────────────────────────────────
No training  ←                               →  Full training
$ cheapest   ←                               →  $$$$ most expensive
Minutes      ←                               →  Weeks/Months
</code></pre>
|
|
38
|
+
<h2 id="fine-tuning"><strong>2. Fine-tuning</strong></h2>

<p><strong>Fine-tuning</strong> = further training an existing FM on <strong>your specific dataset</strong> to improve performance on your domain/task.</p>

<h3 id="when-fine-tune"><strong>2.1. When to Fine-tune?</strong></h3>

<table>
<thead><tr><th>Fine-tune When...</th><th>DON'T Fine-tune When...</th></tr></thead>
<tbody>
<tr><td>Need specific style, tone, or format</td><td>Just need factual Q&A (use RAG)</td></tr>
<tr><td>Domain-specific language patterns</td><td>Task works well with prompting</td></tr>
<tr><td>Improve accuracy on specific tasks</td><td>Don't have labeled training data</td></tr>
<tr><td>Reduce prompt size (internalize instructions)</td><td>Data changes frequently (use RAG)</td></tr>
<tr><td>Need consistent output format</td><td>Budget is limited</td></tr>
</tbody>
</table>

<h3 id="fine-tune-types"><strong>2.2. Types of Fine-tuning</strong></h3>

<table>
<thead><tr><th>Type</th><th>What</th><th>Data Format</th><th>Use Case</th></tr></thead>
<tbody>
<tr><td><strong>Instruction fine-tuning</strong></td><td>Train on prompt-response pairs</td><td>{"prompt": "...", "completion": "..."}</td><td>Follow instructions better</td></tr>
<tr><td><strong>Domain adaptation</strong></td><td>Train on domain text</td><td>Domain documents (medical, legal)</td><td>Learn domain terminology</td></tr>
<tr><td><strong>Task-specific</strong></td><td>Train on specific task examples</td><td>Task input-output pairs</td><td>Classification, extraction</td></tr>
</tbody>
</table>

<h2 id="peft"><strong>3. PEFT & LoRA</strong></h2>

<h3 id="peft-overview"><strong>3.1. Parameter-Efficient Fine-Tuning (PEFT)</strong></h3>

<p>Full fine-tuning updates <strong>ALL model parameters</strong> — expensive and needs lots of GPU memory. PEFT methods update only a <strong>small subset of parameters</strong>.</p>

<pre><code class="language-text">Full Fine-tuning:
  Model:      7 billion parameters
  Updated:    7 billion parameters (100%)
  GPU Memory: Very high
  Cost:       $$$$

PEFT (LoRA):
  Model:      7 billion parameters
  Updated:    ~10 million parameters (~0.1%)
  GPU Memory: Much lower
  Cost:       $$
</code></pre>
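The parameter savings above can be checked with a few lines of arithmetic. A minimal sketch (the 4096×4096 layer size and rank 8 are hypothetical values for illustration, not from any specific model):

```python
def lora_param_count(d: int, k: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d x k weight matrix.

    LoRA freezes the original W (d x k) and learns two low-rank
    factors, B (d x rank) and A (rank x k), so the trainable
    count is rank * (d + k) instead of d * k.
    """
    return rank * (d + k)

# A hypothetical 4096 x 4096 attention projection, adapted at rank 8:
full = 4096 * 4096
lora = lora_param_count(4096, 4096, 8)
print(full, lora, f"{lora / full:.2%}")  # LoRA trains well under 1% here
```

The ratio shrinks further as layers get larger, which is why the exam framing is simply "train a small percentage of parameters".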
|
|
85
|
+
<h3 id="lora"><strong>3.2. LoRA (Low-Rank Adaptation)</strong></h3>

<p>LoRA adds <strong>small trainable matrices</strong> to the model's layers instead of updating all of the weights:</p>

<ul>
<li>Freezes original model weights</li>
<li>Adds small "adapter" matrices (rank decomposition)</li>
<li>Only trains these small adapters</li>
<li>At inference: merge adapters with original weights</li>
</ul>
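The bullet points above can be sketched numerically. This is a toy illustration of the adapter math only, not a training loop: W stays frozen, B starts at zero (a common LoRA initialization), and merging is just adding the product BA back into W. All sizes and values are made up:

```python
def matmul(X, Y):
    """Plain-Python matrix multiply (toy sizes only)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

d, k, rank = 4, 4, 1
W = [[0.1 * (i + j) for j in range(k)] for i in range(d)]  # frozen base weight
B = [[0.0] for _ in range(d)]          # adapter B (d x rank), zero-initialized
A = [[1.0, 2.0, 3.0, 4.0]]             # adapter A (rank x k)

# Before training, BA == 0, so the adapted layer equals the base layer.
delta = matmul(B, A)
assert all(v == 0.0 for row in delta for v in row)

# After (hypothetical) training B is non-zero; merge for inference:
B = [[0.01] for _ in range(d)]
W_merged = [[w + dv for w, dv in zip(wr, dr)]
            for wr, dr in zip(W, matmul(B, A))]
print(len(W_merged), len(W_merged[0]))
```

Because the merged result is a single matrix of the original shape, the fine-tuned model adds no extra inference latency.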
|
|
96
|
+
<blockquote>
<p><strong>Exam tip:</strong> "Which technique reduces the cost of fine-tuning while maintaining quality?" → <strong>LoRA / PEFT</strong>. Key concept: train a small percentage of parameters instead of all.</p>
</blockquote>

<h2 id="continued-pretraining"><strong>4. Continued Pre-training</strong></h2>

<p><strong>Continued Pre-training</strong> trains the FM on <strong>large amounts of unlabeled domain data</strong> — teaching the model new vocabulary and concepts <em>before</em> fine-tuning on task-specific data.</p>

<pre><code class="language-text">Workflow:
  Base FM → Continued Pre-training → Fine-tuning → Evaluation
            (domain corpus,          (labeled      (test on
             unlabeled)               task data)    holdout)

Example:
  Base Claude → Train on 100K medical papers → Fine-tune on
                (continued pre-training)        medical Q&A pairs
  Learns: medical terminology,          Learns: how to answer
          drug names, procedures                clinical questions
</code></pre>

<h3 id="cpt-vs-ft"><strong>Continued Pre-training vs Fine-tuning:</strong></h3>

<table>
<thead><tr><th>Aspect</th><th>Continued Pre-training</th><th>Fine-tuning</th></tr></thead>
<tbody>
<tr><td><strong>Data</strong></td><td>Large, unlabeled domain text</td><td>Smaller, labeled task data</td></tr>
<tr><td><strong>Goal</strong></td><td>Learn domain knowledge</td><td>Learn task-specific behavior</td></tr>
<tr><td><strong>Cost</strong></td><td>More expensive (larger data)</td><td>Less expensive</td></tr>
<tr><td><strong>When</strong></td><td>Model lacks domain vocabulary</td><td>Model needs to do specific tasks</td></tr>
</tbody>
</table>

<h2 id="rlhf"><strong>5. RLHF (Reinforcement Learning from Human Feedback)</strong></h2>

<p>RLHF is used to <strong>align</strong> model outputs with human preferences — making outputs more helpful, truthful, and harmless.</p>

<pre><code class="language-text">RLHF Pipeline:
  1. Collect human feedback   2. Train reward model    3. Optimize with RL
     "Which response is          Learns: what humans      FM generates →
      better? A or B?"           prefer                   reward model scores →
                                                          update FM weights
</code></pre>
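Step 2 of the pipeline is usually trained on pairwise comparisons. One standard formulation (a Bradley-Terry style model — background detail, not something the AIF-C01 exam tests) turns two reward scores into the probability that a human prefers response A over B:

```python
import math

def preference_probability(reward_a: float, reward_b: float) -> float:
    """P(human prefers A over B): sigmoid of the reward difference."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))

# If the reward model scores A well above B, A is almost always preferred;
# equal scores give exactly 50/50.
p = preference_probability(2.0, -1.0)
print(round(p, 3))
```

The RL step then updates the FM so its outputs drift toward higher reward-model scores.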
|
|
139
|
+
<p>RLHF is mainly done by <strong>FM providers</strong> (Anthropic, Meta, Amazon) — not typically by end users. But you should know the concept for the exam.</p>

<h2 id="bedrock-custom"><strong>6. Amazon Bedrock Custom Models</strong></h2>

<p>Bedrock offers two customization approaches:</p>

<h3 id="bedrock-ft"><strong>6.1. Fine-tuning in Bedrock</strong></h3>

<table>
<thead><tr><th>Feature</th><th>Detail</th></tr></thead>
<tbody>
<tr><td><strong>Supported models</strong></td><td>Amazon Titan, Meta Llama, Cohere</td></tr>
<tr><td><strong>Data format</strong></td><td>JSONL with prompt-completion pairs</td></tr>
<tr><td><strong>Data location</strong></td><td>Amazon S3</td></tr>
<tr><td><strong>Output</strong></td><td>Custom model version in Bedrock</td></tr>
<tr><td><strong>Provisioned Throughput</strong></td><td>Required to use fine-tuned model</td></tr>
</tbody>
</table>

<h3 id="bedrock-cpt"><strong>6.2. Continued Pre-training in Bedrock</strong></h3>

<table>
<thead><tr><th>Feature</th><th>Detail</th></tr></thead>
<tbody>
<tr><td><strong>Supported models</strong></td><td>Amazon Titan, Meta Llama, Cohere</td></tr>
<tr><td><strong>Data format</strong></td><td>Plain text files (unlabeled)</td></tr>
<tr><td><strong>Use case</strong></td><td>Domain adaptation before fine-tuning</td></tr>
</tbody>
</table>

<h3 id="bedrock-training-data"><strong>6.3. Training Data Preparation</strong></h3>

<pre><code class="language-json">// Fine-tuning data format (JSONL):
{"prompt": "What is the recommended dosage of Drug X?", "completion": "The recommended dosage of Drug X is 500mg twice daily for adults."}
{"prompt": "List side effects of Drug X.", "completion": "Common side effects include headache, nausea, and dizziness."}
</code></pre>
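Before uploading a training file to S3, it is worth validating it locally. A minimal sketch (the required keys `prompt` and `completion` match the format shown above; the helper itself is illustrative, not a Bedrock API):

```python
import json

def validate_jsonl(lines):
    """Return (line_number, problem) pairs for a prompt/completion JSONL file."""
    problems = []
    for n, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # allow blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((n, "invalid JSON"))
            continue
        for key in ("prompt", "completion"):
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append((n, f"missing or empty '{key}'"))
    return problems

sample = [
    '{"prompt": "List side effects of Drug X.", "completion": "Headache, nausea."}',
    '{"prompt": "", "completion": "Dangling answer."}',
    'not json at all',
]
print(validate_jsonl(sample))  # flags lines 2 and 3
```

Catching malformed records locally is much cheaper than having a customization job fail partway through.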
|
|
176
|
+
<h3 id="bedrock-model-eval"><strong>6.4. Model Evaluation in Bedrock</strong></h3>

<p>Amazon Bedrock Model Evaluation lets you compare models:</p>

<ul>
<li><strong>Automatic evaluation</strong>: Built-in metrics (accuracy, robustness, toxicity)</li>
<li><strong>Human evaluation</strong>: Human reviewers rate model outputs</li>
<li><strong>Compare models</strong>: Side-by-side comparison of different FMs</li>
</ul>

<blockquote>
<p><strong>Exam tip:</strong> "How to compare the quality of two foundation models for a specific use case?" → <strong>Amazon Bedrock Model Evaluation</strong>. Supports both automatic metrics and human evaluation.</p>
</blockquote>

<h2 id="data-prep"><strong>7. Training Data Best Practices</strong></h2>

<table>
<thead><tr><th>Practice</th><th>Why</th></tr></thead>
<tbody>
<tr><td><strong>High-quality data</strong></td><td>Garbage in = garbage out</td></tr>
<tr><td><strong>Diverse examples</strong></td><td>Prevent overfitting to narrow patterns</td></tr>
<tr><td><strong>Balanced classes</strong></td><td>Avoid bias toward majority class</td></tr>
<tr><td><strong>Clean data</strong></td><td>Remove duplicates, errors, PII</td></tr>
<tr><td><strong>Sufficient quantity</strong></td><td>Typically 1,000+ examples for fine-tuning</td></tr>
<tr><td><strong>Train/validation split</strong></td><td>Evaluate on unseen data</td></tr>
<tr><td><strong>Format consistency</strong></td><td>Same structure for all examples</td></tr>
</tbody>
</table>
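The train/validation split from the table is a one-liner in practice. A minimal sketch (the 10% holdout fraction and fixed seed are conventional choices, not requirements):

```python
import random

def train_validation_split(examples, validation_fraction=0.1, seed=42):
    """Shuffle and split so the model is evaluated on unseen data.

    Seeding the shuffle keeps the split reproducible across runs.
    """
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]

examples = [f"example-{i}" for i in range(1000)]
train, validation = train_validation_split(examples)
print(len(train), len(validation))  # 900 100
```

Shuffling before the cut matters: if the file is sorted by topic or date, an unshuffled split would make the validation set unrepresentative.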
|
|
205
|
+
<h2 id="summary-table"><strong>8. Summary: When to Use What</strong></h2>

<table>
<thead><tr><th>Scenario</th><th>Best Approach</th></tr></thead>
<tbody>
<tr><td>Simple task, model already good at it</td><td>Prompt Engineering</td></tr>
<tr><td>Need model to follow a specific pattern</td><td>Few-shot Prompting</td></tr>
<tr><td>Need answers from company documents</td><td>RAG</td></tr>
<tr><td>Need specific style/tone/format</td><td>Fine-tuning</td></tr>
<tr><td>Model doesn't know domain vocabulary</td><td>Continued Pre-training + Fine-tuning</td></tr>
<tr><td>Align with human preferences</td><td>RLHF (done by FM providers)</td></tr>
</tbody>
</table>

<h2 id="practice-questions"><strong>9. Practice Questions</strong></h2>

<p><strong>Q1:</strong> A legal firm wants their AI assistant to generate legal documents in a specific firm-approved writing style. They have 5,000 examples of approved documents. Which customization approach is MOST appropriate?</p>
<ul>
<li>A) RAG with a knowledge base</li>
<li>B) Zero-shot prompting</li>
<li>C) Fine-tuning on the approved document examples ✓</li>
<li>D) Continued pre-training on legal textbooks</li>
</ul>
<p><em>Explanation: Fine-tuning is ideal for teaching a model a specific writing style with labeled examples. RAG is for retrieving information, not learning styles. Continued pre-training would teach legal concepts but not the firm's specific style.</em></p>

<p><strong>Q2:</strong> Which technique allows fine-tuning a large language model while updating only a small fraction of the model's parameters?</p>
<ul>
<li>A) Full fine-tuning</li>
<li>B) LoRA (Low-Rank Adaptation) ✓</li>
<li>C) Continued pre-training</li>
<li>D) RLHF</li>
</ul>
<p><em>Explanation: LoRA is a PEFT (Parameter-Efficient Fine-Tuning) method that adds small trainable adapter matrices while freezing the original model weights — typically updating less than 1% of total parameters.</em></p>

<p><strong>Q3:</strong> A company fine-tuned a foundation model, but the model performs well on training data and poorly on new data. What is this problem called?</p>
<ul>
<li>A) Underfitting</li>
<li>B) Overfitting ✓</li>
<li>C) High bias</li>
<li>D) Data drift</li>
</ul>
<p><em>Explanation: Overfitting occurs when a model memorizes training data instead of learning general patterns. Solutions include: more training data, regularization, lower learning rate, early stopping, or data augmentation.</em></p>
@@ -0,0 +1,276 @@
---
id: 019c9619-lt01-d3-l08
title: 'Bài 8: Amazon Bedrock Deep Dive'
slug: bai-8-amazon-bedrock-deep-dive
description: >-
  Amazon Bedrock: all features. Agents, Guardrails, Model Evaluation.
  PartyRock playground. Amazon Q Developer & Amazon Q Business.
  Choosing the right FM. Pricing models.
duration_minutes: 65
is_free: true
video_url: null
sort_order: 4
section_title: "Domain 3: Applications of Foundation Models (28%)"
course:
  id: 019c9619-lt01-7001-c001-lt0100000001
  title: 'Luyện thi AWS Certified AI Practitioner (AIF-C01)'
  slug: luyen-thi-aws-ai-practitioner
---

<div style="text-align: center; margin: 2rem 0;">
  <img src="/storage/uploads/2026/04/aws-aif-bai8-bedrock-architecture.png" alt="Amazon Bedrock Architecture" style="max-width: 800px; width: 100%; border-radius: 12px;" />
  <p><em>Amazon Bedrock Architecture — Foundation Models, Agents, Guardrails, and Knowledge Bases</em></p>
</div>

<h2 id="bedrock-overview"><strong>1. Amazon Bedrock Overview</strong></h2>

<p><strong>Amazon Bedrock</strong> is a fully managed service that provides access to <strong>FMs from multiple providers</strong> through a single API, along with tools to customize, deploy, and secure AI applications.</p>

<h3 id="bedrock-key-features"><strong>1.1. Key Value Propositions</strong></h3>

<ul>
<li><strong>Choice</strong>: Access FMs from Amazon, Anthropic, Meta, Mistral, Cohere, Stability AI, AI21 Labs</li>
<li><strong>Customization</strong>: Fine-tuning, continued pre-training, RAG (Knowledge Bases)</li>
<li><strong>Security</strong>: Data stays in your AWS account, encrypted, not used to train models</li>
<li><strong>Serverless</strong>: No infrastructure to manage</li>
<li><strong>Integration</strong>: Native AWS service integration (IAM, CloudWatch, CloudTrail)</li>
</ul>
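The "single API" point means every model is reached through the same `InvokeModel` operation; only the model ID and request body differ per provider. A sketch that just builds a Claude-style request (the model ID and body schema are assumptions based on Anthropic's Bedrock message format; the actual call, `boto3.client("bedrock-runtime").invoke_model(**request)`, is left out so the example runs without AWS credentials):

```python
import json

def build_claude_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build an InvokeModel request for an Anthropic model on Bedrock.

    The modelId and body schema are provider-specific; check the
    Bedrock model catalog for the current values.
    """
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return {
        "modelId": "anthropic.claude-3-haiku-20240307-v1:0",
        "contentType": "application/json",
        "body": json.dumps(body),
    }

request = build_claude_request("Summarize what Amazon Bedrock does in one sentence.")
print(request["modelId"])
```

Swapping providers means changing `modelId` and the body schema; the surrounding call, IAM permissions, and logging stay the same.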
|
|
38
|
+
<h3 id="fm-providers"><strong>1.2. Foundation Model Providers on Bedrock</strong></h3>

<table>
<thead><tr><th>Provider</th><th>Models</th><th>Strengths</th></tr></thead>
<tbody>
<tr><td><strong>Amazon</strong></td><td>Titan Text, Titan Embeddings, Titan Image Generator</td><td>General purpose, embeddings, image gen</td></tr>
<tr><td><strong>Anthropic</strong></td><td>Claude 3 Haiku, Sonnet, Opus</td><td>Complex reasoning, analysis, vision</td></tr>
<tr><td><strong>Meta</strong></td><td>Llama 2, Llama 3</td><td>Open-source, customizable</td></tr>
<tr><td><strong>Mistral AI</strong></td><td>Mistral, Mixtral</td><td>Fast, efficient, multilingual</td></tr>
<tr><td><strong>Cohere</strong></td><td>Command, Embed</td><td>Enterprise text, multilingual embeddings</td></tr>
<tr><td><strong>Stability AI</strong></td><td>Stable Diffusion XL</td><td>Image generation</td></tr>
<tr><td><strong>AI21 Labs</strong></td><td>Jurassic</td><td>Text generation, summarization</td></tr>
</tbody>
</table>

<h2 id="bedrock-features"><strong>2. Bedrock Features Deep Dive</strong></h2>

<h3 id="bedrock-agents"><strong>2.1. Amazon Bedrock Agents</strong></h3>

<p>Agents let FMs <strong>carry out multi-step tasks</strong> by automatically planning, executing actions, and using tools.</p>

<pre><code class="language-text">User: "Book a flight from Hanoi to Tokyo for next Friday"

Agent workflow:
  1. PLAN:    Need to search flights, check availability, book
  2. ACTION:  Call flight search API → find available flights
  3. OBSERVE: Found 3 flights, cheapest is $450
  4. ACTION:  Call booking API → reserve the flight
  5. RESPOND: "Booked VN flight HAN→NRT, Dec 20, $450"
</code></pre>
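The plan → action → observe loop above can be sketched as plain code. Everything here is a toy stand-in for what Bedrock manages for you: the `tools` dict plays the role of an action group, the stub functions replace Lambda-backed APIs, and the hard-coded plan replaces the FM's reasoning:

```python
def search_flights(route: str) -> dict:
    # Toy stand-in for a flight-search Lambda in an action group.
    return {"route": route, "flights": [{"id": "VN310", "price": 450},
                                        {"id": "JL752", "price": 610}]}

def book_flight(flight_id: str) -> dict:
    # Toy stand-in for a booking API.
    return {"status": "booked", "flight_id": flight_id}

tools = {"search_flights": search_flights, "book_flight": book_flight}

def run_agent(request: str) -> str:
    # A real agent asks the FM to choose tools; here the plan is fixed.
    observation = tools["search_flights"]("HAN-NRT")          # ACTION + OBSERVE
    cheapest = min(observation["flights"], key=lambda f: f["price"])
    result = tools["book_flight"](cheapest["id"])             # ACTION
    return f"{result['status']}: {cheapest['id']} for ${cheapest['price']}"

print(run_agent("Book a flight from Hanoi to Tokyo"))  # booked: VN310 for $450
```

With a real Bedrock Agent, you supply only the tool definitions (OpenAPI schemas or Lambda functions) and instructions; the planning loop is the managed part.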
|
|
69
|
+
<h3 id="agent-components"><strong>Agent Components:</strong></h3>

<table>
<thead><tr><th>Component</th><th>Purpose</th></tr></thead>
<tbody>
<tr><td><strong>Foundation Model</strong></td><td>Brain that reasons and plans</td></tr>
<tr><td><strong>Instructions</strong></td><td>System prompt defining agent's role</td></tr>
<tr><td><strong>Action Groups</strong></td><td>APIs the agent can call (Lambda functions or OpenAPI schemas)</td></tr>
<tr><td><strong>Knowledge Bases</strong></td><td>RAG data sources for information retrieval</td></tr>
<tr><td><strong>Guardrails</strong></td><td>Safety and compliance filters</td></tr>
</tbody>
</table>

<blockquote>
<p><strong>Exam tip:</strong> "An AI assistant needs to look up order status, check inventory, and process returns" → <strong>Bedrock Agent</strong> with action groups connected to business APIs.</p>
</blockquote>

<h3 id="bedrock-guardrails"><strong>2.2. Amazon Bedrock Guardrails</strong></h3>

<p>Guardrails implement <strong>safety controls</strong> for AI applications:</p>

<table>
<thead><tr><th>Guardrail Type</th><th>What it does</th><th>Example</th></tr></thead>
<tbody>
<tr><td><strong>Content filters</strong></td><td>Block harmful content categories</td><td>Hate, violence, sexual, insults</td></tr>
<tr><td><strong>Denied topics</strong></td><td>Block specific topics</td><td>"Don't discuss competitor products"</td></tr>
<tr><td><strong>Word filters</strong></td><td>Block specific words/phrases</td><td>Profanity, banned terms</td></tr>
<tr><td><strong>PII filters</strong></td><td>Detect and redact PII</td><td>SSN, credit card numbers, emails</td></tr>
<tr><td><strong>Contextual grounding</strong></td><td>Check if response is grounded in context</td><td>Prevent hallucination in RAG</td></tr>
</tbody>
</table>

<pre><code class="language-text">Guardrails Flow:
User Input → [Input Guardrails] → FM Processing → [Output Guardrails] → User
              Check for:                           Check for:
              - Denied topics                      - Harmful content
              - Harmful input                      - PII in response
              - PII in input                       - Off-topic responses
                                                   - Grounding check
</code></pre>
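To make the PII-filter row concrete, here is what "detect and redact" means at its simplest. This is purely conceptual: Bedrock Guardrails is a managed feature you configure in the console or API, not regexes you write and maintain yourself, and these two patterns are illustrative only:

```python
import re

# Illustrative patterns only; real PII detection covers many more
# entity types and edge cases than two regexes can.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace each detected PII span with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact_pii("Contact jane.doe@example.com, SSN 123-45-6789."))
```

As the flow diagram shows, a guardrail can apply this kind of redaction to the user's input, the model's output, or both.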
|
|
110
|
+
<h3 id="bedrock-eval"><strong>2.3. Model Evaluation</strong></h3>

<p>Compare and evaluate FMs for your specific use case:</p>

<ul>
<li><strong>Automatic evaluation</strong>: BERTScore, accuracy, toxicity metrics</li>
<li><strong>Human evaluation</strong>: Custom criteria rated by human reviewers</li>
<li><strong>A/B comparison</strong>: Side-by-side model comparison</li>
<li><strong>Custom tasks</strong>: Upload your own test dataset</li>
</ul>

<h3 id="bedrock-playground"><strong>2.4. Bedrock Playgrounds</strong></h3>

<table>
<thead><tr><th>Playground</th><th>Use Case</th></tr></thead>
<tbody>
<tr><td><strong>Text playground</strong></td><td>Test text models interactively</td></tr>
<tr><td><strong>Chat playground</strong></td><td>Test conversational models</td></tr>
<tr><td><strong>Image playground</strong></td><td>Test image generation models</td></tr>
</tbody>
</table>

<h2 id="partyrock"><strong>3. Amazon PartyRock</strong></h2>

<p><strong>PartyRock</strong> is a <strong>free, no-code playground</strong> for Bedrock that lets anyone build GenAI apps without an AWS account or coding skills.</p>

<table>
<thead><tr><th>Feature</th><th>Detail</th></tr></thead>
<tbody>
<tr><td><strong>No AWS account needed</strong></td><td>Free to use with social login</td></tr>
<tr><td><strong>No coding</strong></td><td>Drag-and-drop app builder</td></tr>
<tr><td><strong>Shareable</strong></td><td>Share apps via URL</td></tr>
<tr><td><strong>Use case</strong></td><td>Learning, prototyping, experimentation</td></tr>
</tbody>
</table>

<blockquote>
<p><strong>Exam tip:</strong> "A non-technical marketing team wants to experiment with generative AI without an AWS account" → <strong>PartyRock</strong>.</p>
</blockquote>

<h2 id="amazon-q"><strong>4. Amazon Q</strong></h2>

<h3 id="q-developer"><strong>4.1. Amazon Q Developer</strong></h3>

<p>An AI coding assistant for developers:</p>

<ul>
<li><strong>Code generation</strong>: Write code from natural language</li>
<li><strong>Code explanation</strong>: Explain existing code</li>
<li><strong>Code transformation</strong>: Upgrade Java versions, .NET migrations</li>
<li><strong>Debugging</strong>: Identify and fix bugs</li>
<li><strong>Security scanning</strong>: Find vulnerabilities in code</li>
<li><strong>IDE integration</strong>: VS Code, JetBrains, AWS Console</li>
</ul>

<h3 id="q-business"><strong>4.2. Amazon Q Business</strong></h3>

<p>An AI assistant for business users:</p>

<ul>
<li><strong>Connect enterprise data</strong>: S3, SharePoint, Confluence, Salesforce, etc.</li>
<li><strong>Q&A on company data</strong>: Answers based on connected data sources</li>
<li><strong>Respects access controls</strong>: ACLs from connected systems</li>
<li><strong>Plugins</strong>: Create tickets (Jira), send emails, etc.</li>
</ul>

<h3 id="q-vs-bedrock"><strong>4.3. Amazon Q vs Bedrock</strong></h3>

<table>
<thead><tr><th>Feature</th><th>Amazon Q</th><th>Amazon Bedrock</th></tr></thead>
<tbody>
<tr><td><strong>Target user</strong></td><td>End users (devs, business)</td><td>Developers building AI apps</td></tr>
<tr><td><strong>Customization</strong></td><td>Limited (connect data sources)</td><td>Full (fine-tune, RAG, agents)</td></tr>
<tr><td><strong>Managed</strong></td><td>Fully managed assistant</td><td>API/SDK access to FMs</td></tr>
<tr><td><strong>Use case</strong></td><td>Productivity tool</td><td>Building custom AI applications</td></tr>
</tbody>
</table>

<h2 id="pricing"><strong>5. Bedrock Pricing Models</strong></h2>

<table>
<thead><tr><th>Pricing Model</th><th>How it works</th><th>Best For</th></tr></thead>
<tbody>
<tr><td><strong>On-Demand</strong></td><td>Pay per input/output token</td><td>Variable, unpredictable workloads</td></tr>
<tr><td><strong>Provisioned Throughput</strong></td><td>Reserved model units (hourly)</td><td>Consistent, production workloads</td></tr>
<tr><td><strong>Batch Inference</strong></td><td>Submit batch jobs (up to 50% cheaper)</td><td>Large-scale, non-real-time processing</td></tr>
</tbody>
</table>
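A quick back-of-the-envelope comparison shows why the pricing model matters. All numbers below are hypothetical (per-1K-token prices and token counts are made up for illustration; check the Bedrock pricing page for real rates), and the batch figure assumes the full 50% discount:

```python
def on_demand_cost(input_tokens, output_tokens,
                   price_in_per_1k, price_out_per_1k):
    """On-demand token cost; prices are per 1,000 tokens."""
    return ((input_tokens / 1000) * price_in_per_1k
            + (output_tokens / 1000) * price_out_per_1k)

# Hypothetical prices for illustration only.
price_in, price_out = 0.003, 0.015

# 50,000 documents at roughly 1,500 input / 200 output tokens each:
docs = 50_000
od = on_demand_cost(docs * 1500, docs * 200, price_in, price_out)
batch = od * 0.5  # batch inference: up to 50% off on-demand
print(round(od, 2), round(batch, 2))  # 375.0 187.5
```

Provisioned Throughput does not fit this formula at all: you pay per reserved model unit per hour, which only wins when traffic is steady enough to keep the reserved capacity busy.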
|
|
199
|
+
<blockquote>
<p><strong>Exam tip:</strong> "Cost-optimize a GenAI workload with predictable traffic?" → <strong>Provisioned Throughput</strong>. "Process thousands of documents overnight?" → <strong>Batch Inference</strong>.</p>
</blockquote>

<h2 id="choosing-fm"><strong>6. How to Choose the Right FM</strong></h2>

<pre><code class="language-text">Decision Framework:
┌─────────────────────────────────────────────────┐
│ 1. TASK TYPE                                    │
│    Text? Image? Code? Multi-modal?              │
├─────────────────────────────────────────────────┤
│ 2. COMPLEXITY                                   │
│    Simple classification → smaller model        │
│    Complex reasoning → larger model             │
├─────────────────────────────────────────────────┤
│ 3. LATENCY REQUIREMENTS                         │
│    Real-time → smaller/faster model (Haiku)     │
│    Batch processing → larger model (Opus)       │
├─────────────────────────────────────────────────┤
│ 4. COST CONSTRAINTS                             │
│    Budget limited → smaller model               │
│    Quality critical → larger model              │
├─────────────────────────────────────────────────┤
│ 5. CUSTOMIZATION NEEDS                          │
│    Fine-tuning needed? Check supported models   │
│    LoRA? Check compatibility                    │
├─────────────────────────────────────────────────┤
│ 6. EVALUATE with Model Evaluation               │
│    Test candidates side-by-side                 │
└─────────────────────────────────────────────────┘
</code></pre>

<h2 id="other-services"><strong>7. Other AWS GenAI Services</strong></h2>

<table>
<thead><tr><th>Service</th><th>What it does</th></tr></thead>
<tbody>
<tr><td><strong>Amazon CodeWhisperer</strong></td><td>Now part of Amazon Q Developer (code suggestions)</td></tr>
<tr><td><strong>AWS App Studio</strong></td><td>Build enterprise apps with natural language</td></tr>
<tr><td><strong>Amazon SageMaker JumpStart</strong></td><td>Deploy open-source FMs with SageMaker</td></tr>
<tr><td><strong>Amazon Comprehend</strong></td><td>NLP service (sentiment, entities, topics — pre-built)</td></tr>
<tr><td><strong>Amazon Transcribe</strong></td><td>Speech-to-text</td></tr>
<tr><td><strong>Amazon Polly</strong></td><td>Text-to-speech</td></tr>
<tr><td><strong>Amazon Translate</strong></td><td>Machine translation</td></tr>
<tr><td><strong>Amazon Rekognition</strong></td><td>Image/video analysis</td></tr>
<tr><td><strong>Amazon Textract</strong></td><td>Extract text from documents (OCR+)</td></tr>
</tbody>
</table>

<h2 id="practice-questions"><strong>8. Practice Questions</strong></h2>

<p><strong>Q1:</strong> A retail company wants to build an AI assistant that can check inventory, process returns, and answer product questions from their catalog. Which Amazon Bedrock feature should they use?</p>
<ul>
<li>A) Bedrock Guardrails</li>
<li>B) Bedrock Knowledge Bases only</li>
<li>C) Bedrock Agents with Action Groups and Knowledge Bases ✓</li>
<li>D) Bedrock Model Evaluation</li>
</ul>
<p><em>Explanation: Bedrock Agents can orchestrate multi-step tasks by calling APIs (action groups for inventory/returns) and retrieving information (knowledge bases for the product catalog).</em></p>

<p><strong>Q2:</strong> Which Amazon Bedrock feature should be used to prevent a generative AI application from discussing competitor products and to filter out personally identifiable information (PII)?</p>
<ul>
<li>A) Bedrock Knowledge Bases</li>
<li>B) Bedrock Custom Models</li>
<li>C) Bedrock Guardrails ✓</li>
<li>D) Bedrock Agents</li>
</ul>
<p><em>Explanation: Guardrails provide denied topic filtering (block competitor discussions) and PII detection/redaction. They can be applied to both input and output of FM calls.</em></p>

<p><strong>Q3:</strong> A company wants to process 50,000 customer reviews overnight for sentiment analysis using a foundation model. Which Bedrock pricing model is MOST cost-effective?</p>
<ul>
<li>A) On-Demand pricing</li>
<li>B) Provisioned Throughput</li>
<li>C) Batch Inference ✓</li>
<li>D) Free tier</li>
</ul>
<p><em>Explanation: Batch Inference is designed for large-scale, non-real-time workloads and offers up to 50% cost savings compared to on-demand pricing. Ideal for overnight processing.</em></p>