@xdev-asia/xdev-knowledge-mcp 1.0.40 → 1.0.42

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (19)
  1. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/01-domain-1-fundamentals-ai-ml/lessons/01-bai-1-ai-ml-deep-learning-concepts.md +287 -0
  2. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/01-domain-1-fundamentals-ai-ml/lessons/02-bai-2-ml-lifecycle-aws-services.md +258 -0
  3. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/02-domain-2-fundamentals-generative-ai/lessons/03-bai-3-generative-ai-foundation-models.md +218 -0
  4. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/02-domain-2-fundamentals-generative-ai/lessons/04-bai-4-llm-transformers-multimodal.md +232 -0
  5. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/03-domain-3-applications-foundation-models/lessons/05-bai-5-prompt-engineering-techniques.md +254 -0
  6. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/03-domain-3-applications-foundation-models/lessons/06-bai-6-rag-vector-databases-knowledge-bases.md +244 -0
  7. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/03-domain-3-applications-foundation-models/lessons/07-bai-7-fine-tuning-model-customization.md +247 -0
  8. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/03-domain-3-applications-foundation-models/lessons/08-bai-8-amazon-bedrock-deep-dive.md +276 -0
  9. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/04-domain-4-responsible-ai/lessons/09-bai-9-responsible-ai-fairness-bias-transparency.md +224 -0
  10. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/04-domain-4-responsible-ai/lessons/10-bai-10-aws-responsible-ai-tools.md +252 -0
  11. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/05-domain-5-security-compliance/lessons/11-bai-11-ai-security-data-privacy-compliance.md +279 -0
  12. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/chapters/05-domain-5-security-compliance/lessons/12-bai-12-exam-strategy-cheat-sheet.md +229 -0
  13. package/content/series/luyen-thi/luyen-thi-aws-ai-practitioner/index.md +257 -0
  14. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/index.md +240 -0
  15. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/index.md +225 -0
  16. package/data/categories.json +16 -4
  17. package/data/quizzes.json +764 -0
  18. package/data/settings.json +2 -1
  19. package/package.json +1 -1
@@ -0,0 +1,218 @@
+ ---
+ id: 019c9619-lt01-d2-l03
+ title: 'Bài 3: Generative AI & Foundation Models'
+ slug: bai-3-generative-ai-foundation-models
+ description: >-
+   What Generative AI is. Foundation Models: pre-training, fine-tuning.
+   Types: text-to-text, text-to-image, text-to-code. Tokenization.
+   Model parameters, inference, temperature, top-p, top-k.
+ duration_minutes: 60
+ is_free: true
+ video_url: null
+ sort_order: 1
+ section_title: "Domain 2: Fundamentals of Generative AI (24%)"
+ course:
+   id: 019c9619-lt01-7001-c001-lt0100000001
+   title: 'Luyện thi AWS Certified AI Practitioner (AIF-C01)'
+   slug: luyen-thi-aws-ai-practitioner
+ ---
+
+ <div style="text-align: center; margin: 2rem 0;">
+ <img src="/storage/uploads/2026/04/aws-aif-bai3-foundation-model-lifecycle.png" alt="Foundation Model Lifecycle" style="max-width: 800px; width: 100%; border-radius: 12px;" />
+ <p><em>Foundation Model Lifecycle: Pre-training, Fine-tuning, RAG, and Prompt Engineering</em></p>
+ </div>
+
+ <h2 id="overview"><strong>Domain 2 Overview</strong></h2>
+
+ <p>Domain 2 accounts for <strong>24% of the exam</strong>, making it the second-largest domain. You need a solid grasp of Generative AI, Foundation Models, and how they differ from traditional ML.</p>
+
+ <h2 id="what-is-genai"><strong>1. What is Generative AI?</strong></h2>
+
+ <p><strong>Generative AI</strong> is the branch of AI focused on <strong>creating new content</strong> (text, images, code, audio, video) based on patterns learned from training data.</p>
+
+ <h3 id="discriminative-vs-generative"><strong>Discriminative vs Generative AI</strong></h3>
+
+ <table>
+ <thead><tr><th>Aspect</th><th>Discriminative AI</th><th>Generative AI</th></tr></thead>
+ <tbody>
+ <tr><td><strong>What it does</strong></td><td>Classify / predict</td><td>Create / generate</td></tr>
+ <tr><td><strong>Output</strong></td><td>Label, category, number</td><td>New content (text, image, code)</td></tr>
+ <tr><td><strong>Example</strong></td><td>"Is this email spam?" → Yes/No</td><td>"Write an email about..." → New email</td></tr>
+ <tr><td><strong>Models</strong></td><td>Logistic Regression, SVM, CNN classifier</td><td>GPT, Claude, Stable Diffusion, DALL-E</td></tr>
+ </tbody>
+ </table>
+
+ <h3 id="genai-modalities"><strong>Generative AI Modalities</strong></h3>
+
+ <table>
+ <thead><tr><th>Input → Output</th><th>Examples</th><th>Models</th></tr></thead>
+ <tbody>
+ <tr><td><strong>Text → Text</strong></td><td>Chatbot, summarization, translation</td><td>GPT-4, Claude, Llama</td></tr>
+ <tr><td><strong>Text → Image</strong></td><td>Image generation from description</td><td>DALL-E, Stable Diffusion, Titan Image Generator</td></tr>
+ <tr><td><strong>Text → Code</strong></td><td>Code generation, debugging</td><td>Amazon CodeWhisperer (now Amazon Q Developer), GitHub Copilot</td></tr>
+ <tr><td><strong>Text → Audio</strong></td><td>Speech synthesis, music generation</td><td>Amazon Polly (TTS)</td></tr>
+ <tr><td><strong>Image → Text</strong></td><td>Image captioning, visual Q&A</td><td>Claude (multi-modal), GPT-4V</td></tr>
+ <tr><td><strong>Audio → Text</strong></td><td>Transcription</td><td>Amazon Transcribe, Whisper</td></tr>
+ </tbody>
+ </table>
+
+ <h2 id="foundation-models"><strong>2. Foundation Models</strong></h2>
+
+ <p>A <strong>Foundation Model (FM)</strong> is a very large AI model, <strong>pre-trained on massive datasets</strong>, that can be adapted to many different downstream tasks.</p>
+
+ <h3 id="fm-characteristics"><strong>Key Characteristics</strong></h3>
+
+ <ul>
+ <li><strong>Large-scale pre-training</strong>: Trained on billions of data points (text from the internet, books, code)</li>
+ <li><strong>General-purpose</strong>: Can handle multiple tasks without task-specific training</li>
+ <li><strong>Adaptable</strong>: Can be fine-tuned or prompted for specific use cases</li>
+ <li><strong>Expensive to train</strong>: Requires massive compute (GPU/TPU clusters)</li>
+ <li><strong>Accessible via API</strong>: Users don't need to train the model — they use it through APIs (Amazon Bedrock)</li>
+ </ul>
+
+ <h3 id="fm-lifecycle"><strong>Foundation Model Lifecycle</strong></h3>
+
+ <pre><code class="language-text">┌─────────────────┐     ┌──────────────┐     ┌──────────────┐
+ │ 1. Pre-training │────→│ 2. Fine-     │────→│ 3. Inference │
+ │ (Massive data,  │     │    tuning    │     │ (Use model   │
+ │  Billion params,│     │ (Adapt to    │     │  via API or  │
+ │  Very expensive)│     │  specific    │     │  endpoint)   │
+ │                 │     │  domain)     │     │              │
+ └─────────────────┘     └──────────────┘     └──────────────┘
+   Model Provider           You/Org              Users
+  (Anthropic, Meta,                          (Applications)
+   Amazon, etc.)
+ </code></pre>
+
+ <h2 id="tokenization"><strong>3. Tokenization</strong></h2>
+
+ <p><strong>Tokenization</strong> is the process of splitting text into small units (<strong>tokens</strong>) that the model can work with.</p>
+
+ <pre><code class="language-text">Input:  "Machine learning is amazing!"
+ Tokens: ["Machine", " learning", " is", " amazing", "!"]
+           token_1    token_2     token_3  token_4   token_5
+
+ OR (subword tokenization):
+ Tokens: ["Mach", "ine", " learn", "ing", " is", " amaz", "ing", "!"]
+ </code></pre>
+
+ <h3 id="token-key-points"><strong>Key Concepts for the Exam:</strong></h3>
+
+ <ul>
+ <li><strong>Token ≠ word</strong>: A token can be part of a word, a whole word, or punctuation</li>
+ <li><strong>Context window</strong>: Maximum number of tokens a model can process at once (input + output)</li>
+ <li><strong>Token limit</strong>: Determines how much text the model can "see" and generate</li>
+ <li><strong>Pricing</strong>: API calls are typically priced per token (input tokens + output tokens)</li>
+ </ul>
+
+ <blockquote>
+ <p><strong>Exam tip:</strong> Context window size matters. A larger context window can handle longer documents, but it costs more and may be slower.</p>
+ </blockquote>
+
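The pricing and context-window points above can be sketched in code. A minimal sketch, assuming the common rule of thumb of roughly 4 characters per token for English text; real counts depend on the model's own tokenizer (e.g. BPE), and both function names here are our own:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters/token heuristic
    for English text. Actual counts come from the model's tokenizer."""
    return max(1, round(len(text) / 4))

def fits_context(prompt: str, max_output_tokens: int, context_window: int) -> bool:
    """Input and output tokens share the same context window,
    so both must fit within the model's limit."""
    return estimate_tokens(prompt) + max_output_tokens <= context_window
```

For example, `estimate_tokens("Machine learning is amazing!")` gives 7, close to the 5-token whitespace split shown above; a subword tokenizer would land somewhere in between.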
+ <h2 id="model-parameters"><strong>4. Model Parameters & Inference Settings</strong></h2>
+
+ <h3 id="model-params"><strong>4.1. Model Parameters (Learned during training)</strong></h3>
+
+ <ul>
+ <li><strong>Parameters</strong> = the weights and biases in the neural network</li>
+ <li>GPT-4: ~1.7 trillion parameters (rumored; undisclosed), Claude: undisclosed, Llama 3: 8B/70B (Llama 3.1: 405B)</li>
+ <li>More parameters → generally more capable, but more expensive</li>
+ </ul>
+
+ <h3 id="inference-params"><strong>4.2. Inference Parameters (Set by user)</strong></h3>
+
+ <p>When calling a model, you can adjust the following <strong>inference parameters</strong>:</p>
+
+ <table>
+ <thead><tr><th>Parameter</th><th>Range</th><th>What it controls</th></tr></thead>
+ <tbody>
+ <tr><td><strong>Temperature</strong></td><td>0.0 → 1.0+</td><td>Randomness/creativity. Low = deterministic, focused. High = creative, diverse.</td></tr>
+ <tr><td><strong>Top-p (Nucleus)</strong></td><td>0.0 → 1.0</td><td>Cumulative probability threshold. Lower = more focused vocabulary.</td></tr>
+ <tr><td><strong>Top-k</strong></td><td>1 → ∞</td><td>Number of top tokens to consider. Lower = more predictable.</td></tr>
+ <tr><td><strong>Max tokens</strong></td><td>1 → limit</td><td>Maximum length of generated output.</td></tr>
+ <tr><td><strong>Stop sequences</strong></td><td>strings</td><td>Text that tells the model to stop generating.</td></tr>
+ </tbody>
+ </table>
+
+ <h3 id="temperature-guide"><strong>Temperature Guide for the Exam</strong></h3>
+
+ <pre><code class="language-text">Temperature = 0    → Most deterministic (factual Q&A, code, data extraction)
+ Temperature = 0.3  → Slightly creative (business writing, summaries)
+ Temperature = 0.7  → Creative (stories, brainstorming, marketing copy)
+ Temperature = 1.0+ → Very random (poetry, creative writing — may hallucinate more)
+ </code></pre>
+
+ <blockquote>
+ <p><strong>Exam tip:</strong> "A company needs consistent, accurate answers for customer FAQ" → use <strong>low temperature</strong>. "A company wants creative marketing slogans" → use <strong>high temperature</strong>.</p>
+ </blockquote>
+
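To make the table above concrete, here is an illustrative next-token sampler showing how temperature, top-k, and top-p interact on a toy distribution. This is a sketch of the general technique, not any provider's implementation; the function name and toy logits are invented:

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=None):
    """Sample one token from a {token: raw score} dict, applying
    temperature scaling, then top-k and top-p (nucleus) filtering."""
    rng = rng or random.Random()
    if temperature <= 0:
        # Greedy decoding: fully deterministic, pick the highest score
        return max(logits, key=logits.get)
    # Temperature scaling, then softmax (max-subtracted for stability)
    scaled = {t: s / temperature for t, s in logits.items()}
    m = max(scaled.values())
    probs = {t: math.exp(s - m) for t, s in scaled.items()}
    z = sum(probs.values())
    items = sorted(((t, p / z) for t, p in probs.items()), key=lambda kv: -kv[1])
    if top_k is not None:
        items = items[:top_k]            # keep only the k most likely tokens
    if top_p is not None:
        kept, cum = [], 0.0
        for t, p in items:
            kept.append((t, p))
            cum += p
            if cum >= top_p:             # smallest set with mass >= top_p
                break
        items = kept
    z = sum(p for _, p in items)         # renormalize after filtering
    r = rng.random() * z
    for t, p in items:
        r -= p
        if r <= 0:
            return t
    return items[-1][0]
```

Note how `temperature=0` (or `top_k=1`) makes the output fully deterministic, matching the exam guidance that low temperature means consistent answers.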
+ <h2 id="hallucination"><strong>5. Hallucination</strong></h2>
+
+ <p><strong>Hallucination</strong> is when a model produces output that is <strong>confident but incorrect</strong>: it invents facts, citations, or information that does not exist.</p>
+
+ <h3 id="hallucination-causes"><strong>Causes:</strong></h3>
+ <ul>
+ <li>Training data gaps or outdated information</li>
+ <li>The model doesn't truly "know" facts — it predicts likely next tokens</li>
+ <li>Ambiguous or overly open-ended prompts</li>
+ <li>High temperature settings</li>
+ </ul>
+
+ <h3 id="hallucination-mitigation"><strong>Mitigation Strategies:</strong></h3>
+ <table>
+ <thead><tr><th>Strategy</th><th>How it helps</th></tr></thead>
+ <tbody>
+ <tr><td><strong>RAG</strong> (Retrieval-Augmented Generation)</td><td>Grounds responses in actual data from a knowledge base</td></tr>
+ <tr><td><strong>Lower temperature</strong></td><td>Reduces randomness in generation</td></tr>
+ <tr><td><strong>Guardrails</strong></td><td>Filter/validate outputs (Amazon Bedrock Guardrails)</td></tr>
+ <tr><td><strong>Better prompts</strong></td><td>"Only answer based on the provided context" / "Say 'I don't know' if unsure"</td></tr>
+ <tr><td><strong>Fine-tuning</strong></td><td>Trains the model on accurate domain-specific data</td></tr>
+ <tr><td><strong>Human review</strong></td><td>Human-in-the-loop validation</td></tr>
+ </tbody>
+ </table>
+
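The "better prompts" row in the table can be as simple as a template that pins the model to supplied context. The wording below is illustrative, not an official AWS pattern, and the helper name is our own:

```python
def grounded_prompt(context: str, question: str) -> str:
    """Wrap a question in a grounding instruction: one of the
    cheapest hallucination mitigations."""
    return (
        "Answer ONLY using the context below. "
        "If the answer is not in the context, reply \"I don't know\".\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

In a RAG pipeline, `context` would be the text retrieved from the knowledge base for this question.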
+ <h2 id="fm-on-aws"><strong>6. Foundation Models on AWS (Amazon Bedrock)</strong></h2>
+
+ <p>Amazon Bedrock provides access to Foundation Models from multiple providers:</p>
+
+ <table>
+ <thead><tr><th>Provider</th><th>Models</th><th>Strengths</th></tr></thead>
+ <tbody>
+ <tr><td><strong>Anthropic</strong></td><td>Claude 3 (Haiku, Sonnet, Opus)</td><td>Reasoning, safety, long context</td></tr>
+ <tr><td><strong>Meta</strong></td><td>Llama 3</td><td>Open-source, versatile</td></tr>
+ <tr><td><strong>Amazon</strong></td><td>Titan (Text, Embeddings, Image)</td><td>AWS-native, embeddings for RAG</td></tr>
+ <tr><td><strong>Mistral AI</strong></td><td>Mistral, Mixtral</td><td>Efficient, fast inference</td></tr>
+ <tr><td><strong>Stability AI</strong></td><td>Stable Diffusion</td><td>Image generation</td></tr>
+ <tr><td><strong>Cohere</strong></td><td>Command, Embed</td><td>Enterprise NLP, embeddings</td></tr>
+ <tr><td><strong>AI21 Labs</strong></td><td>Jurassic</td><td>Text generation</td></tr>
+ </tbody>
+ </table>
+
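A sketch of calling one of these models through Bedrock with the inference parameters from section 4. The request body follows the Anthropic Messages format that Bedrock documents for Claude models; verify the field names and model ID against the current Bedrock documentation before relying on them:

```python
import json

def build_claude_body(prompt: str, temperature: float = 0.2,
                      top_p: float = 0.9, max_tokens: int = 512) -> str:
    """Build an InvokeModel request body in the Anthropic Messages
    format used by Claude models on Amazon Bedrock."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "messages": [{"role": "user", "content": prompt}],
    })

def invoke_claude(prompt: str) -> str:
    """Call Claude 3 Haiku on Bedrock. Requires boto3, AWS credentials,
    and model access granted in the Bedrock console."""
    import boto3  # imported here so the builder above stays dependency-free
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    resp = client.invoke_model(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        body=build_claude_body(prompt),
    )
    return json.loads(resp["body"].read())["content"][0]["text"]
```

The low default temperature here matches the FAQ-style use case from the exam tip; raise it (and `top_p`) for creative tasks.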
+ <h2 id="practice-questions"><strong>7. Practice Questions</strong></h2>
+
+ <p><strong>Q1:</strong> What is the PRIMARY advantage of Foundation Models compared to traditional ML models?</p>
+ <ul>
+ <li>A) They are smaller and faster</li>
+ <li>B) They can be adapted to multiple downstream tasks without task-specific training ✓</li>
+ <li>C) They never produce incorrect outputs</li>
+ <li>D) They don't require any compute resources</li>
+ </ul>
+ <p><em>Explanation: Foundation Models are pre-trained on massive datasets and can be adapted (via prompting or fine-tuning) for many different tasks. They are large, can hallucinate, and still require compute.</em></p>
+
+ <p><strong>Q2:</strong> A company uses a generative AI model and notices it sometimes generates plausible but factually incorrect information. What is this phenomenon called?</p>
+ <ul>
+ <li>A) Overfitting</li>
+ <li>B) Data drift</li>
+ <li>C) Hallucination ✓</li>
+ <li>D) Bias</li>
+ </ul>
+ <p><em>Explanation: Hallucination is when a generative AI model produces confident but factually incorrect outputs.</em></p>
+
+ <p><strong>Q3:</strong> A developer wants to ensure their generative AI chatbot provides consistent, factual answers with minimal creativity. Which inference parameter should they adjust?</p>
+ <ul>
+ <li>A) Set max tokens to a very high value</li>
+ <li>B) Set temperature close to 0 ✓</li>
+ <li>C) Set temperature close to 1</li>
+ <li>D) Increase the top-k value</li>
+ </ul>
+ <p><em>Explanation: Low temperature makes the model more deterministic and focused, reducing creativity and randomness in responses.</em></p>
@@ -0,0 +1,232 @@
+ ---
+ id: 019c9619-lt01-d2-l04
+ title: 'Bài 4: LLMs, Transformers & Multi-modal Models'
+ slug: bai-4-llm-transformers-multimodal
+ description: >-
+   Transformer architecture: attention mechanism, self-attention.
+   GPT (decoder-only), BERT (encoder-only), T5 (encoder-decoder).
+   Multi-modal models. Hallucination: causes and mitigation.
+   Embeddings and vector representations.
+ duration_minutes: 60
+ is_free: true
+ video_url: null
+ sort_order: 2
+ section_title: "Domain 2: Fundamentals of Generative AI (24%)"
+ course:
+   id: 019c9619-lt01-7001-c001-lt0100000001
+   title: 'Luyện thi AWS Certified AI Practitioner (AIF-C01)'
+   slug: luyen-thi-aws-ai-practitioner
+ ---
+
+ <div style="text-align: center; margin: 2rem 0;">
+ <img src="/storage/uploads/2026/04/aws-aif-bai4-transformer-architecture.png" alt="Transformer Architecture" style="max-width: 800px; width: 100%; border-radius: 12px;" />
+ <p><em>Transformer Architecture: encoder stack, decoder stack, and the BERT/GPT/T5 variants</em></p>
+ </div>
+
+ <h2 id="transformer"><strong>1. Transformer Architecture</strong></h2>
+
+ <p>The Transformer is the neural network architecture that <strong>revolutionized NLP</strong>, introduced in the paper "Attention Is All You Need" (2017). Most current LLMs are based on the Transformer.</p>
+
+ <h3 id="attention"><strong>1.1. Self-Attention Mechanism</strong></h3>
+
+ <p>Self-attention lets the model weigh the <strong>relationships between all words</strong> in the input, regardless of how far apart they are.</p>
+
+ <pre><code class="language-text">Input: "The cat sat on the mat because it was tired"
+
+ Self-attention answers: What does "it" refer to?
+ → Attends to "cat" (high attention score)
+ → Not "mat" (low attention score)
+
+ A traditional RNN would struggle with this long-range dependency.
+ </code></pre>
+
+ <h3 id="encoder-decoder"><strong>1.2. Encoder-Decoder Architecture</strong></h3>
+
+ <pre><code class="language-text">Original Transformer:
+ ┌──────────────────────────┐
+ │         ENCODER          │ ← Understands input
+ │  (Self-Attention +       │
+ │   Feed-Forward layers)   │
+ ├──────────────────────────┤
+ │         DECODER          │ ← Generates output
+ │  (Masked Self-Attention +│
+ │   Cross-Attention +      │
+ │   Feed-Forward layers)   │
+ └──────────────────────────┘
+ </code></pre>
+
+ <h3 id="transformer-types"><strong>1.3. Three Types of Transformers</strong></h3>
+
+ <table>
+ <thead><tr><th>Type</th><th>Architecture</th><th>Best For</th><th>Models</th></tr></thead>
+ <tbody>
+ <tr><td><strong>Encoder-only</strong></td><td>Encoder</td><td>Understanding text (classification, NER, sentiment)</td><td>BERT, RoBERTa, DistilBERT</td></tr>
+ <tr><td><strong>Decoder-only</strong></td><td>Decoder</td><td>Generating text (chatbot, content creation)</td><td>GPT-4, Claude, Llama</td></tr>
+ <tr><td><strong>Encoder-Decoder</strong></td><td>Both</td><td>Sequence-to-sequence (translation, summarization)</td><td>T5, BART</td></tr>
+ </tbody>
+ </table>
+
+ <blockquote>
+ <p><strong>Exam tip:</strong> "Which architecture is best for text generation?" → <strong>Decoder-only</strong> (GPT, Claude). "Which architecture is best for text classification?" → <strong>Encoder-only</strong> (BERT).</p>
+ </blockquote>
+
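The intuition in 1.1 can be sketched numerically. Below is a minimal scaled dot-product attention for a single query over toy 2-D vectors; the vector values are invented for illustration, while real models use learned, high-dimensional query/key/value projections:

```python
import math

def softmax(xs):
    # Max-subtracted exponentials for numerical stability
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query:
    scores = q·k / sqrt(d), weights = softmax(scores),
    output = weighted sum of the value vectors."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(len(values[0]))]
    return weights, output

# Toy example: the query for "it" is closer to the key for "cat"
# than to the key for "mat", so "cat" gets most of the attention.
q_it = [1.0, 0.9]
keys = [[1.0, 1.0],    # "cat"
        [-0.5, 0.2]]   # "mat"
values = [[1.0, 0.0],  # value vector for "cat"
          [0.0, 1.0]]  # value vector for "mat"
weights, out = attention(q_it, keys, values)
```

The division by `sqrt(d)` is the "scaled" part from the original paper; it keeps dot products from growing with dimension and saturating the softmax.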
+ <h2 id="llm"><strong>2. Large Language Models (LLMs)</strong></h2>
+
+ <p>LLMs are Foundation Models specifically for text — trained on massive text corpora to understand and generate human language.</p>
+
+ <h3 id="llm-capabilities"><strong>2.1. LLM Capabilities</strong></h3>
+
+ <table>
+ <thead><tr><th>Capability</th><th>Description</th><th>Example</th></tr></thead>
+ <tbody>
+ <tr><td><strong>Text Generation</strong></td><td>Create new text content</td><td>Articles, emails, stories</td></tr>
+ <tr><td><strong>Summarization</strong></td><td>Condense long text</td><td>Document summaries</td></tr>
+ <tr><td><strong>Translation</strong></td><td>Convert between languages</td><td>English → Vietnamese</td></tr>
+ <tr><td><strong>Q&A</strong></td><td>Answer questions</td><td>Customer support, FAQ</td></tr>
+ <tr><td><strong>Code Generation</strong></td><td>Write and explain code</td><td>Amazon Q Developer</td></tr>
+ <tr><td><strong>Text Classification</strong></td><td>Categorize text</td><td>Sentiment analysis</td></tr>
+ <tr><td><strong>Reasoning</strong></td><td>Logical analysis</td><td>Math problems, step-by-step reasoning</td></tr>
+ </tbody>
+ </table>
+
+ <h3 id="llm-limitations"><strong>2.2. LLM Limitations</strong></h3>
+
+ <ul>
+ <li><strong>Knowledge cutoff</strong>: Doesn't know about events after its training data cutoff date</li>
+ <li><strong>Hallucination</strong>: Can generate false information confidently</li>
+ <li><strong>Context window limit</strong>: Can't process unlimited text</li>
+ <li><strong>No real-time data</strong>: Can't access the internet or live data (unless augmented)</li>
+ <li><strong>Expensive</strong>: Large models need significant compute for inference</li>
+ <li><strong>Bias</strong>: Can reflect biases in training data</li>
+ </ul>
+
+ <h2 id="embeddings"><strong>3. Embeddings & Vector Representations</strong></h2>
+
+ <p><strong>Embeddings</strong> turn text (or images, audio) into <strong>numerical vectors</strong> that machines can work with. Texts with similar meanings get vectors that are close together in a high-dimensional space.</p>
+
+ <pre><code class="language-text">Text: "King"   → [0.23, 0.87, -0.12, 0.45, ...]
+ Text: "Queen"  → [0.21, 0.89, -0.15, 0.43, ...]  ← Close vectors!
+ Text: "Banana" → [0.91, -0.32, 0.67, -0.88, ...] ← Far away
+
+ Relationship: King - Man + Woman ≈ Queen
+ </code></pre>
+
+ <h3 id="embeddings-use"><strong>Why Embeddings Matter for the Exam:</strong></h3>
+
+ <ul>
+ <li><strong>Semantic search</strong>: Find similar documents based on meaning (not just keywords)</li>
+ <li><strong>RAG</strong>: Convert documents to embeddings, store them in a vector DB, retrieve relevant context</li>
+ <li><strong>Clustering</strong>: Group similar documents/sentences</li>
+ <li><strong>Amazon Titan Embeddings</strong>: AWS model specifically for creating text embeddings</li>
+ </ul>
+
+ <h3 id="vector-db"><strong>Vector Databases</strong></h3>
+
+ <p>Vector databases store and search embeddings efficiently:</p>
+
+ <table>
+ <thead><tr><th>Vector DB</th><th>Notes</th></tr></thead>
+ <tbody>
+ <tr><td><strong>Amazon OpenSearch Serverless</strong></td><td>AWS-managed vector search</td></tr>
+ <tr><td><strong>Amazon Aurora (pgvector)</strong></td><td>PostgreSQL with the pgvector extension</td></tr>
+ <tr><td><strong>Pinecone</strong></td><td>Popular third-party vector DB</td></tr>
+ <tr><td><strong>Amazon Bedrock Knowledge Bases</strong></td><td>Managed RAG — handles vector storage internally</td></tr>
+ </tbody>
+ </table>
+
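"Close vectors" is usually measured with cosine similarity, the metric most vector databases use for semantic search. A minimal sketch using the toy 4-dimensional vectors from the diagram above (real embedding models such as Amazon Titan Embeddings return hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors:
    1.0 = same direction, 0 = unrelated, negative = opposed."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" truncated to 4 dimensions for illustration
king   = [0.23, 0.87, -0.12, 0.45]
queen  = [0.21, 0.89, -0.15, 0.43]
banana = [0.91, -0.32, 0.67, -0.88]

print(cosine_similarity(king, queen))   # close to 1.0: semantically similar
print(cosine_similarity(king, banana))  # much lower: unrelated
```

A semantic-search query works the same way: embed the query, then return the stored vectors with the highest cosine similarity.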
+ <h2 id="multimodal"><strong>4. Multi-modal Models</strong></h2>
+
+ <p><strong>Multi-modal models</strong> can process and generate content across <strong>multiple data types</strong> (text + images + audio + video).</p>
+
+ <h3 id="multimodal-examples"><strong>Examples on AWS:</strong></h3>
+
+ <table>
+ <thead><tr><th>Model</th><th>Modalities</th><th>What it can do</th></tr></thead>
+ <tbody>
+ <tr><td><strong>Claude 3</strong> (Anthropic)</td><td>Text + Image input → Text output</td><td>Describe images, analyze charts, visual Q&A</td></tr>
+ <tr><td><strong>Amazon Titan Image Generator</strong></td><td>Text → Image</td><td>Create images from text descriptions</td></tr>
+ <tr><td><strong>Amazon Titan Multimodal Embeddings</strong></td><td>Text + Image → Vectors</td><td>Search across text and images</td></tr>
+ <tr><td><strong>Stable Diffusion</strong> (Stability AI)</td><td>Text → Image</td><td>Generate and edit images</td></tr>
+ </tbody>
+ </table>
+
+ <h3 id="multimodal-usecases"><strong>Multi-modal Use Cases for the Exam:</strong></h3>
+
+ <ul>
+ <li>"Analyze product images and generate descriptions" → Multi-modal model (Claude 3 Vision)</li>
+ <li>"Generate product images from text descriptions" → Text-to-image (Titan Image Generator, Stable Diffusion)</li>
+ <li>"Search across both text documents and images" → Multi-modal embeddings</li>
+ </ul>
+
+ <h2 id="diffusion"><strong>5. Diffusion Models</strong></h2>
+
+ <p>Diffusion models (such as Stable Diffusion) work in two phases:</p>
+
+ <ol>
+ <li><strong>Forward process</strong>: Gradually add noise to an image until it becomes pure noise</li>
+ <li><strong>Reverse process</strong>: Learn to remove the noise step by step, generating a new image</li>
+ </ol>
+
+ <pre><code class="language-text">Training (Forward):
+ Clean Image → Add Noise → Add More Noise → ... → Pure Noise
+
+ Generation (Reverse):
+ Pure Noise → Remove Noise → Remove More Noise → ... → New Image
+                                                  (guided by text prompt)
+ </code></pre>
+
+ <blockquote>
+ <p><strong>Exam tip:</strong> You don't need the detailed math. Just understand the concept: diffusion models generate images by <strong>gradually removing noise, guided by the text prompt</strong>.</p>
+ </blockquote>
+
+ <h2 id="training-types"><strong>6. Pre-training vs Fine-tuning vs Prompting</strong></h2>
+
+ <table>
+ <thead><tr><th>Method</th><th>What</th><th>Data Needed</th><th>Cost</th><th>When to Use</th></tr></thead>
+ <tbody>
+ <tr><td><strong>Pre-training</strong></td><td>Train from scratch</td><td>Billions of examples</td><td>$$$$</td><td>Creating a new FM (done by providers)</td></tr>
+ <tr><td><strong>Fine-tuning</strong></td><td>Further train an existing FM</td><td>Thousands of examples</td><td>$$</td><td>Domain-specific knowledge</td></tr>
+ <tr><td><strong>Prompt Engineering</strong></td><td>Craft better inputs</td><td>None (few examples)</td><td>$</td><td>Quick adaptation, no training needed</td></tr>
+ <tr><td><strong>RAG</strong></td><td>Augment with external data</td><td>Knowledge base</td><td>$</td><td>Access current/proprietary data</td></tr>
+ </tbody>
+ </table>
+
+ <h3 id="decision-tree"><strong>Decision Tree for the Exam:</strong></h3>
+
+ <pre><code class="language-text">Need the model to know specific domain knowledge?
+ ├── Is the knowledge in documents you can provide?
+ │   ├── YES → RAG (Bedrock Knowledge Bases)
+ │   └── NO, the model needs to learn patterns →
+ │       ├── Have thousands of training examples? → Fine-tuning
+ │       └── Only a few examples? → Few-shot prompting
+ └── General knowledge is enough? → Prompt Engineering (zero/few-shot)
+ </code></pre>
+
+ <h2 id="practice-questions"><strong>7. Practice Questions</strong></h2>
+
+ <p><strong>Q1:</strong> A company wants to search for relevant information across both product images and text descriptions. Which type of model would be MOST suitable?</p>
+ <ul>
+ <li>A) A text-only LLM</li>
+ <li>B) A multi-modal embedding model ✓</li>
+ <li>C) A diffusion model</li>
+ <li>D) An RNN model</li>
+ </ul>
+ <p><em>Explanation: Multi-modal embedding models can create vector representations of both text and images in the same vector space, enabling cross-modal search.</em></p>
+
+ <p><strong>Q2:</strong> Which Transformer architecture is BEST suited for text generation tasks such as chatbots and content creation?</p>
+ <ul>
+ <li>A) Encoder-only (BERT)</li>
+ <li>B) Decoder-only (GPT, Claude) ✓</li>
+ <li>C) Encoder-decoder (T5)</li>
+ <li>D) Convolutional Neural Network (CNN)</li>
+ </ul>
+ <p><em>Explanation: Decoder-only architectures generate text one token at a time (autoregressive) and are the basis for most modern chatbots and text generators.</em></p>
+
+ <p><strong>Q3:</strong> What is the purpose of text embeddings in the context of generative AI applications?</p>
+ <ul>
+ <li>A) To compress files for storage</li>
+ <li>B) To convert text into numerical vectors that capture semantic meaning ✓</li>
+ <li>C) To encrypt text for security</li>
+ <li>D) To translate text between languages</li>
+ </ul>
+ <p><em>Explanation: Embeddings are numerical vector representations of text that capture semantic meaning. Similar texts have similar vectors, enabling semantic search, RAG, and clustering.</em></p>