@xdev-asia/xdev-knowledge-mcp 1.0.43 → 1.0.44

Files changed (29)
  1. package/content/pages/xoa-du-lieu-nguoi-dung.md +68 -0
  2. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/01-phan-1-data-engineering/lessons/01-bai-1-data-repositories-ingestion.md +5 -0
  3. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/01-phan-1-data-engineering/lessons/02-bai-2-data-transformation.md +5 -0
  4. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/01-phan-1-data-engineering/lessons/03-bai-3-data-analysis.md +159 -0
  5. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/02-phan-2-modeling/lessons/04-bai-4-sagemaker-built-in-algorithms.md +186 -0
  6. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/02-phan-2-modeling/lessons/05-bai-5-training-hyperparameter-tuning.md +159 -0
  7. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/02-phan-2-modeling/lessons/06-bai-6-model-evaluation.md +169 -0
  8. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/03-phan-3-implementation-operations/lessons/07-bai-7-model-deployment.md +193 -0
  9. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/03-phan-3-implementation-operations/lessons/08-bai-8-model-monitoring-mlops.md +184 -0
  10. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/03-phan-3-implementation-operations/lessons/09-bai-9-security-cost.md +166 -0
  11. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/04-phan-4-on-tap/lessons/10-bai-10-bai-toan-thuong-gap.md +181 -0
  12. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/04-phan-4-on-tap/lessons/11-bai-11-cheat-sheet.md +110 -0
  13. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/chapters/04-phan-4-on-tap/lessons/12-bai-12-chien-luoc-thi.md +113 -0
  14. package/content/series/luyen-thi/luyen-thi-aws-ml-specialty/index.md +1 -1
  15. package/content/series/luyen-thi/luyen-thi-cka/index.md +217 -0
  16. package/content/series/luyen-thi/luyen-thi-ckad/index.md +199 -0
  17. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/chapters/01-phan-1-problem-framing/lessons/01-bai-1-framing-ml-problems.md +136 -0
  18. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/chapters/01-phan-1-problem-framing/lessons/02-bai-2-gcp-ai-ml-ecosystem.md +160 -0
  19. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/chapters/02-phan-2-data-engineering/lessons/03-bai-3-data-pipeline.md +174 -0
  20. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/chapters/02-phan-2-data-engineering/lessons/04-bai-4-feature-engineering.md +156 -0
  21. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/chapters/03-phan-3-model-development/lessons/05-bai-5-vertex-ai-training.md +155 -0
  22. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/chapters/03-phan-3-model-development/lessons/06-bai-6-bigquery-ml-tensorflow.md +141 -0
  23. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/chapters/04-phan-4-deployment-mlops/lessons/07-bai-7-model-deployment.md +134 -0
  24. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/chapters/04-phan-4-deployment-mlops/lessons/08-bai-8-vertex-ai-pipelines-mlops.md +149 -0
  25. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/chapters/05-phan-5-responsible-ai/lessons/09-bai-9-responsible-ai.md +128 -0
  26. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/chapters/05-phan-5-responsible-ai/lessons/10-bai-10-cheat-sheet-chien-luoc-thi.md +108 -0
  27. package/content/series/luyen-thi/luyen-thi-gcp-ml-engineer/index.md +1 -1
  28. package/content/series/luyen-thi/luyen-thi-kcna/index.md +168 -0
  29. package/package.json +1 -1
@@ -0,0 +1,217 @@
1
+ ---
2
+ id: lt-cka-series-001
3
+ title: "CKA Exam Prep: Certified Kubernetes Administrator"
4
+ slug: luyen-thi-cka
5
+ description: >-
6
+ A comprehensive study roadmap for the CKA (Certified Kubernetes Administrator) exam.
7
+ Covers all 5 hands-on domains: Troubleshooting (30%), Cluster Architecture (25%),
8
+ Services & Networking (20%), Workloads & Scheduling (15%), Storage (10%).
9
+ 12 lessons with hands-on terminal exercises.
10
+
11
+ featured_image: null
12
+ level: intermediate
13
+ duration_hours: 35
14
+ lesson_count: 12
15
+ price: '0.00'
16
+ is_free: true
17
+ view_count: 0
18
+ average_rating: '0.00'
19
+ review_count: 0
20
+ enrollment_count: 0
21
+ meta: null
22
+ published_at: '2026-04-05T10:00:00.000000Z'
23
+ created_at: '2026-04-05T10:00:00.000000Z'
24
+
25
+ author:
26
+ id: 019c9616-d2b4-713f-9b2c-40e2e92a05cf
27
+ name: Duy Tran
28
+ avatar: avatars/7e8eb5c6-4cac-455b-a701-4060f085d501.jpeg
29
+
30
+ category:
31
+ id: 019c9616-cat9-7009-a009-000000000009
32
+ name: Certification exam prep
33
+ slug: luyen-thi
34
+
35
+ tags:
36
+
37
+ - name: Kubernetes
38
+ slug: kubernetes
39
+ - name: CKA
40
+ slug: cka
41
+ - name: CNCF
42
+ slug: cncf
43
+ name: Certification
44
+ slug: chung-chi
45
+ - name: DevOps
46
+ slug: devops
47
+ - name: Linux Foundation
48
+ slug: linux-foundation
49
+
50
+ quiz_slug: cka
51
+
52
+ sections:
53
+
54
+ - id: cka-section-01
55
+ title: "Domain 1: Cluster Architecture, Installation & Configuration (25%)"
56
+ description: kubeadm, RBAC, cluster upgrade, etcd backup/restore
57
+ sort_order: 1
58
+ lessons:
59
+ - id: cka-d1-l01
60
+ title: "Lesson 1: Kubernetes Architecture & kubeadm Cluster Setup"
61
+ slug: 01-kien-truc-cka-kubeadm
62
+ description: >-
63
+ Control plane components in depth. kubeadm init, join, config.
64
+ High Availability cluster topology. Certificate rotation.
65
+ kubectl config, contexts, kubeconfig.
66
+ duration_minutes: 65
67
+ is_free: true
68
+ sort_order: 1
69
+ video_url: null
70
+ - id: cka-d1-l02
71
+ title: "Lesson 2: Cluster Upgrade with kubeadm"
72
+ slug: 02-cluster-upgrade-kubeadm
73
+ description: >-
74
+ Upgrade strategy: control plane → worker nodes.
75
+ drain, cordon, uncordon. Version skew policy.
76
+ kubeadm upgrade plan/apply. Rollback procedures.
77
+ duration_minutes: 60
78
+ is_free: true
79
+ sort_order: 2
80
+ video_url: null
81
+ - id: cka-d1-l03
82
+ title: "Lesson 3: RBAC (Role-Based Access Control)"
83
+ slug: 03-rbac-cka
84
+ description: >-
85
+ Role vs ClusterRole. RoleBinding vs ClusterRoleBinding.
86
+ ServiceAccounts and tokens. kubectl auth can-i.
87
+ Aggregated ClusterRoles. Least privilege patterns.
88
+ duration_minutes: 60
89
+ is_free: true
90
+ sort_order: 3
91
+ video_url: null
92
+
93
+ - id: cka-section-02
94
+ title: "Domain 2: Workloads & Scheduling (15%)"
95
+ description: Deployments, DaemonSets, scheduling, taints/tolerations, affinity
96
+ sort_order: 2
97
+ lessons:
98
+ - id: cka-d2-l01
99
+ title: "Lesson 4: Deployments, DaemonSets & StatefulSets"
100
+ slug: 04-deployments-daemonsets-statefulsets
101
+ description: >-
102
+ Rolling update strategies. Rollback. ReplicaSet vs Deployment.
103
+ DaemonSet use cases. StatefulSet: headless service, volumeClaimTemplates.
104
+ Resource requests & limits. QoS classes.
105
+ duration_minutes: 60
106
+ is_free: true
107
+ sort_order: 4
108
+ video_url: null
109
+ - id: cka-d2-l02
110
+ title: "Lesson 5: Scheduling (Taints, Tolerations & Affinity)"
111
+ slug: 05-scheduling-taints-affinity
112
+ description: >-
113
+ Manual scheduling (nodeName, nodeSelector). Node Affinity/Anti-affinity.
114
+ Pod Affinity/Anti-affinity. Taints & Tolerations.
115
+ Priority Classes. Resource quotas & LimitRanges.
116
+ duration_minutes: 60
117
+ is_free: true
118
+ sort_order: 5
119
+ video_url: null
120
+
121
+ - id: cka-section-03
122
+ title: "Domain 3: Services & Networking (20%)"
123
+ description: Services, DNS, Ingress, NetworkPolicies, CNI
124
+ sort_order: 3
125
+ lessons:
126
+ - id: cka-d3-l01
127
+ title: "Lesson 6: Services, Endpoints & CoreDNS"
128
+ slug: 06-services-endpoints-coredns
129
+ description: >-
130
+ ClusterIP, NodePort, LoadBalancer, ExternalName, Headless.
131
+ Endpoints & EndpointSlices. CoreDNS configuration.
132
+ Service discovery patterns. kube-proxy modes (iptables, IPVS).
133
+ duration_minutes: 60
134
+ is_free: true
135
+ sort_order: 6
136
+ video_url: null
137
+ - id: cka-d3-l02
138
+ title: "Lesson 7: Ingress, Network Policies & CNI"
139
+ slug: 07-ingress-networkpolicies-cni
140
+ description: >-
141
+ Ingress controllers (nginx). Ingress rules, TLS.
142
+ NetworkPolicy: ingress/egress rules, label selectors.
143
+ CNI plugins: Calico, Flannel, Cilium overview.
144
+ Pod CIDR vs Service CIDR.
145
+ duration_minutes: 65
146
+ is_free: true
147
+ sort_order: 7
148
+ video_url: null
149
+
150
+ - id: cka-section-04
151
+ title: "Domain 4: Storage (10%)"
152
+ description: PV, PVC, StorageClass, volumes
153
+ sort_order: 4
154
+ lessons:
155
+ - id: cka-d4-l01
156
+ title: "Lesson 8: Persistent Volumes, PVCs & StorageClass"
157
+ slug: 08-persistent-volumes-storageclass
158
+ description: >-
159
+ PersistentVolume, PersistentVolumeClaim lifecycle. Access modes.
160
+ Reclaim policies. StorageClass & dynamic provisioning.
161
+ hostPath, emptyDir, NFS. VolumeMount vs VolumeFrom.
162
+ duration_minutes: 55
163
+ is_free: true
164
+ sort_order: 8
165
+ video_url: null
166
+
167
+ - id: cka-section-05
168
+ title: "Domain 5: Troubleshooting (30%)"
169
+ description: Node, workload, networking troubleshooting, etcd backup
170
+ sort_order: 5
171
+ lessons:
172
+ - id: cka-d5-l01
173
+ title: "Lesson 9: etcd Backup & Restore"
174
+ slug: 09-etcd-backup-restore
175
+ description: >-
176
+ etcd architecture. etcdctl snapshot save/restore.
177
+ Environment variables: ETCDCTL_API, certificates.
178
+ Static pod manifest for etcd. Recovery procedures.
179
+ Encryption at rest configuration.
180
+ duration_minutes: 55
181
+ is_free: true
182
+ sort_order: 9
183
+ video_url: null
184
+ - id: cka-d5-l02
185
+ title: "Lesson 10: Troubleshooting Nodes & Cluster"
186
+ slug: 10-troubleshooting-nodes
187
+ description: >-
188
+ Node NotReady: kubelet, container runtime issues.
189
+ journalctl, systemctl debugging. Certificate issues.
190
+ Control plane component failures. Static pods.
191
+ kubectl describe node, kubectl get events.
192
+ duration_minutes: 60
193
+ is_free: true
194
+ sort_order: 10
195
+ video_url: null
196
+ - id: cka-d5-l03
197
+ title: "Lesson 11: Troubleshooting Workloads"
198
+ slug: 11-troubleshooting-workloads
199
+ description: >-
200
+ Pod stuck states: Pending, CrashLoopBackOff, ImagePullBackOff, OOMKilled.
201
+ kubectl logs, exec, describe. Init container debugging.
202
+ Resource constraints. Liveness/Readiness probe failures.
203
+ duration_minutes: 60
204
+ is_free: true
205
+ sort_order: 11
206
+ video_url: null
207
+ - id: cka-d5-l04
208
+ title: "Lesson 12: Troubleshooting Networking & Exam Strategy"
209
+ slug: 12-troubleshooting-networking-exam
210
+ description: >-
211
+ DNS resolution failures. Service not reachable. kube-proxy issues.
212
+ Network policy blocking traffic. nslookup, curl debugging.
213
+ CKA exam tips: time management, kubectl shortcuts, imperative commands.
214
+ duration_minutes: 60
215
+ is_free: true
216
+ sort_order: 12
217
+ video_url: null
@@ -0,0 +1,199 @@
1
+ ---
2
+ id: lt-ckad-series-001
3
+ title: "CKAD Exam Prep: Certified Kubernetes Application Developer"
4
+ slug: luyen-thi-ckad
5
+ description: >-
6
+ A comprehensive study roadmap for the CKAD (Certified Kubernetes Application Developer) exam.
7
+ Covers all 5 hands-on domains: App Environment & Security (25%), App Design & Build (20%),
8
+ App Deployment (20%), Services & Networking (20%), App Observability (15%).
9
+ 10 lessons with hands-on terminal exercises.
10
+
11
+ featured_image: null
12
+ level: intermediate
13
+ duration_hours: 28
14
+ lesson_count: 10
15
+ price: '0.00'
16
+ is_free: true
17
+ view_count: 0
18
+ average_rating: '0.00'
19
+ review_count: 0
20
+ enrollment_count: 0
21
+ meta: null
22
+ published_at: '2026-04-05T10:00:00.000000Z'
23
+ created_at: '2026-04-05T10:00:00.000000Z'
24
+
25
+ author:
26
+ id: 019c9616-d2b4-713f-9b2c-40e2e92a05cf
27
+ name: Duy Tran
28
+ avatar: avatars/7e8eb5c6-4cac-455b-a701-4060f085d501.jpeg
29
+
30
+ category:
31
+ id: 019c9616-cat9-7009-a009-000000000009
32
+ name: Certification exam prep
33
+ slug: luyen-thi
34
+
35
+ tags:
36
+
37
+ - name: Kubernetes
38
+ slug: kubernetes
39
+ - name: CKAD
40
+ slug: ckad
41
+ - name: CNCF
42
+ slug: cncf
43
+ name: Certification
44
+ slug: chung-chi
45
+ - name: DevOps
46
+ slug: devops
47
+ - name: Linux Foundation
48
+ slug: linux-foundation
49
+
50
+ quiz_slug: ckad
51
+
52
+ sections:
53
+
54
+ - id: ckad-section-01
55
+ title: "Domain 1: Application Design and Build (20%)"
56
+ description: Multi-container pods, init containers, jobs, CronJobs
57
+ sort_order: 1
58
+ lessons:
59
+ - id: ckad-d1-l01
60
+ title: "Lesson 1: Multi-container Pods & Init Containers"
61
+ slug: 01-multi-container-pods
62
+ description: >-
63
+ Sidecar pattern, Ambassador, Adapter patterns.
64
+ Init containers: sequencing, use cases.
65
+ Shared volumes between containers. Container ports.
66
+ Ephemeral containers for debugging.
67
+ duration_minutes: 60
68
+ is_free: true
69
+ sort_order: 1
70
+ video_url: null
71
+ - id: ckad-d1-l02
72
+ title: "Lesson 2: Jobs, CronJobs & Resource Management"
73
+ slug: 02-jobs-cronjobs-resources
74
+ description: >-
75
+ Job completions, parallelism, backoffLimit.
76
+ CronJob schedule syntax, concurrencyPolicy.
77
+ Resource requests vs limits. QoS classes: Guaranteed, Burstable, BestEffort.
78
+ LimitRange, ResourceQuota.
79
+ duration_minutes: 55
80
+ is_free: true
81
+ sort_order: 2
82
+ video_url: null
83
+
84
+ - id: ckad-section-02
85
+ title: "Domain 2: Application Deployment (20%)"
86
+ description: Rolling updates, rollbacks, Helm, Kustomize, deployment strategies
87
+ sort_order: 2
88
+ lessons:
89
+ - id: ckad-d2-l01
90
+ title: "Lesson 3: Rolling Updates, Rollbacks & Deployment Strategies"
91
+ slug: 03-rolling-updates-rollbacks
92
+ description: >-
93
+ RollingUpdate vs Recreate strategy. maxUnavailable, maxSurge.
94
+ kubectl rollout history/undo/status. Blue-Green deployment.
95
+ Canary deployment with labels. Pause/resume rollouts.
96
+ duration_minutes: 60
97
+ is_free: true
98
+ sort_order: 3
99
+ video_url: null
100
+ - id: ckad-d2-l02
101
+ title: "Lesson 4: Helm & Kustomize Basics"
102
+ slug: 04-helm-kustomize
103
+ description: >-
104
+ Helm chart structure: Chart.yaml, values.yaml, templates/.
105
+ helm install/upgrade/rollback. Helm hooks.
106
+ Kustomize: base + overlays, patches, namePrefix.
107
+ kubectl apply -k vs helm install.
108
+ duration_minutes: 55
109
+ is_free: true
110
+ sort_order: 4
111
+ video_url: null
112
+
113
+ - id: ckad-section-03
114
+ title: "Domain 3: Application Observability and Maintenance (15%)"
115
+ description: Probes, logging, monitoring, debugging
116
+ sort_order: 3
117
+ lessons:
118
+ - id: ckad-d3-l01
119
+ title: "Lesson 5: Probes, Logging & Debugging"
120
+ slug: 05-probes-logging-debugging
121
+ description: >-
122
+ Liveness, Readiness, Startup probes: httpGet, exec, tcpSocket.
123
+ Probe timing: initialDelaySeconds, periodSeconds, failureThreshold.
124
+ kubectl logs, stern. kubectl exec. Debugging crashed containers.
125
+ kubectl top (metrics-server). Events and conditions.
126
+ duration_minutes: 60
127
+ is_free: true
128
+ sort_order: 5
129
+ video_url: null
130
+
131
+ - id: ckad-section-04
132
+ title: "Domain 4: Application Environment, Configuration & Security (25%)"
133
+ description: ConfigMaps, Secrets, SecurityContext, ServiceAccounts, RBAC
134
+ sort_order: 4
135
+ lessons:
136
+ - id: ckad-d4-l01
137
+ title: "Lesson 6: ConfigMaps & Secrets"
138
+ slug: 06-configmaps-secrets
139
+ description: >-
140
+ ConfigMap: from literal, file, env. Inject via env / envFrom / volume.
141
+ Secret types: Opaque, TLS, dockerconfigjson. Base64 encoding.
142
+ Secrets as volumes vs env vars. External Secrets overview.
143
+ duration_minutes: 55
144
+ is_free: true
145
+ sort_order: 6
146
+ video_url: null
147
+ - id: ckad-d4-l02
148
+ title: "Lesson 7: SecurityContext & Pod Security"
149
+ slug: 07-securitycontext-pod-security
150
+ description: >-
151
+ runAsUser, runAsGroup, fsGroup. readOnlyRootFilesystem.
152
+ capabilities: add/drop. allowPrivilegeEscalation.
153
+ Pod Security Standards: Privileged, Baseline, Restricted.
154
+ ServiceAccount: automountServiceAccountToken, projected volumes.
155
+ duration_minutes: 60
156
+ is_free: true
157
+ sort_order: 7
158
+ video_url: null
159
+ - id: ckad-d4-l03
160
+ title: "Lesson 8: Resource Requests, Limits & QoS"
161
+ slug: 08-resources-qos
162
+ description: >-
163
+ CPU (millicores) vs Memory (MiB/GiB) units. requests vs limits.
164
+ OOMKilled and CPU throttling. QoS classes in detail.
165
+ LimitRange per container/pod. ResourceQuota per namespace.
166
+ Horizontal Pod Autoscaler (HPA) basics.
167
+ duration_minutes: 55
168
+ is_free: true
169
+ sort_order: 8
170
+ video_url: null
171
+
172
+ - id: ckad-section-05
173
+ title: "Domain 5: Services & Networking (20%)"
174
+ description: Services, Ingress, Network Policies
175
+ sort_order: 5
176
+ lessons:
177
+ - id: ckad-d5-l01
178
+ title: "Lesson 9: Services & Ingress"
179
+ slug: 09-services-ingress
180
+ description: >-
181
+ ClusterIP, NodePort, LoadBalancer, ExternalName. Headless service.
182
+ port vs targetPort vs nodePort. Ingress rules, path types.
183
+ TLS termination. Ingress class. Service vs Ingress use cases.
184
+ duration_minutes: 60
185
+ is_free: true
186
+ sort_order: 9
187
+ video_url: null
188
+ - id: ckad-d5-l02
189
+ title: "Lesson 10: Network Policies & CKAD Exam Strategy"
190
+ slug: 10-networkpolicies-exam-strategy
191
+ description: >-
192
+ NetworkPolicy: podSelector, namespaceSelector, ipBlock.
193
+ Ingress vs Egress rules. Default deny patterns.
194
+ CKAD exam tips: imperative kubectl commands, --dry-run=client,
195
+ time management, bookmarking docs, common task templates.
196
+ duration_minutes: 60
197
+ is_free: true
198
+ sort_order: 10
199
+ video_url: null
@@ -0,0 +1,136 @@
1
+ ---
2
+ id: 019c9619-lt03-l01
3
+ title: 'Lesson 1: Framing ML Problems (Supervised, Unsupervised, RL)'
4
+ slug: bai-1-framing-ml-problems
5
+ description: >-
6
+ How to decide whether a problem needs ML at all. Choosing the right model type.
7
+ Business metrics vs ML metrics. Data availability assessment.
8
+ Google's ML best practices.
9
+ duration_minutes: 50
10
+ is_free: true
11
+ video_url: null
12
+ sort_order: 1
13
+ section_title: "Part 1: ML Problem Framing & Architecture"
14
+ course:
15
+ id: 019c9619-lt03-7003-c003-lt0300000003
16
+ title: 'Google Cloud Professional Machine Learning Engineer Exam Prep'
17
+ slug: luyen-thi-gcp-ml-engineer
18
+ ---
19
+
20
+ <div style="text-align: center; margin: 2rem 0;">
21
+ <img src="/storage/uploads/2026/04/gcp-mle-bai1-problem-framing.png" alt="ML Problem Framing Framework" style="max-width: 800px; width: 100%; border-radius: 12px;" />
22
+ <p><em>ML Problem Framing: define the problem, choose the model type, and define metrics following Google's guidance</em></p>
23
+ </div>
24
+
25
+ <h2 id="when-to-use-ml"><strong>1. When Should You Use ML?</strong></h2>
26
+
27
+ <p>The Google ML certification often asks about <strong>problem framing</strong>: deciding whether a problem is a good fit for ML and, if it is, which type of ML to use. This is a core skill for a professional ML Engineer.</p>
28
+
29
+ <table>
30
+ <thead><tr><th>Question to ask</th><th>If "Yes"</th><th>If "No"</th></tr></thead>
31
+ <tbody>
32
+ <tr><td>Are there complex patterns in the data?</td><td>ML can help</td><td>Rules-based logic is enough</td></tr>
33
+ <tr><td>Is there enough data (labels)?</td><td>Supervised Learning</td><td>Unsupervised, or collect more</td></tr>
34
+ <tr><td>Can the output be clearly defined?</td><td>Supervised ML</td><td>Clarify with stakeholders</td></tr>
35
+ <tr><td>Does the problem need an agent interacting with an environment?</td><td>Reinforcement Learning</td><td>Supervised/Unsupervised</td></tr>
36
+ </tbody>
37
+ </table>
38
+
39
+ <h2 id="ml-types"><strong>2. ML Types and When to Use Them</strong></h2>
40
+
41
+ <pre><code class="language-text">Problem Framing Decision Tree:
42
+
43
+ Has labeled training data?
44
+ YES → Supervised Learning
45
+ ├── Output is category? → Classification
46
+ └── Output is number? → Regression
47
+
48
+ NO → Has examples, no labels?
49
+ YES → Unsupervised Learning
50
+ ├── Find groups? → Clustering
51
+ └── Find patterns/anomalies? → Density estimation
52
+ NO → Agent in environment?
53
+ YES → Reinforcement Learning
54
+ NO → Reconsider problem definition
55
+ </code></pre>
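The decision tree above can be sketched as a small Python function. This is a minimal illustration and not part of the package: the function name `frame_problem` and its boolean parameters are hypothetical, chosen only to mirror the branches of the tree.

```python
def frame_problem(
    has_labels: bool,
    output_is_category: bool = False,
    has_unlabeled_examples: bool = False,
    wants_groups: bool = False,
    agent_in_environment: bool = False,
) -> str:
    """Walk the problem-framing decision tree and return the suggested approach.

    All parameter names are hypothetical; they mirror the questions in the tree.
    """
    if has_labels:
        # Supervised learning: the output type picks the task.
        return "Classification" if output_is_category else "Regression"
    if has_unlabeled_examples:
        # Unsupervised learning: grouping vs. pattern/anomaly finding.
        return "Clustering" if wants_groups else "Density estimation"
    if agent_in_environment:
        return "Reinforcement Learning"
    return "Reconsider problem definition"


if __name__ == "__main__":
    # Churn prediction with historical churn labels: a yes/no output.
    print(frame_problem(has_labels=True, output_is_category=True))
    # Customer segmentation with no labels.
    print(frame_problem(has_labels=False, has_unlabeled_examples=True, wants_groups=True))
```

The branches are checked in the same order as the tree, so labeled data always wins over the unsupervised and reinforcement learning paths.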
56
+
57
+ <table>
58
+ <thead><tr><th>ML Type</th><th>When to Use</th><th>GCP Services</th></tr></thead>
59
+ <tbody>
60
+ <tr><td><strong>Supervised Classification</strong></td><td>Email spam, image labels, churn prediction</td><td>Vertex AI AutoML, BigQuery ML</td></tr>
61
+ <tr><td><strong>Supervised Regression</strong></td><td>Price prediction, demand forecast</td><td>Vertex AI, BigQuery ML (e.g. <code>LINEAR_REG</code>, <code>BOOSTED_TREE_REGRESSOR</code>)</td></tr>
62
+ <tr><td><strong>Unsupervised Clustering</strong></td><td>Customer segmentation, topic discovery</td><td>Vertex AI Custom Training (k-means)</td></tr>
63
+ <tr><td><strong>Reinforcement Learning</strong></td><td>Game agents, robotics, ad bidding</td><td>Vertex AI + custom environment</td></tr>
64
+ <tr><td><strong>Self-supervised</strong></td><td>LLMs, foundation models</td><td>Vertex AI Model Garden</td></tr>
65
+ </tbody>
66
+ </table>
67
+
68
+ <h2 id="business-vs-ml-metrics"><strong>3. Business Metrics vs. ML Metrics</strong></h2>
69
+
70
+ <p>A common mistake is <strong>optimizing the wrong metric</strong>. The ML objective must align with the business objective.</p>
71
+
72
+ <table>
73
+ <thead><tr><th>Business Goal</th><th>Wrong ML Metric</th><th>Correct ML Metric</th></tr></thead>
74
+ <tbody>
75
+ <tr><td>Reduce revenue lost to fraud</td><td>Accuracy (99%!)</td><td>Recall (catch as much fraud as possible)</td></tr>
76
+ <tr><td>Filter spam email without hurting user experience</td><td>Recall</td><td>Precision (few false positives)</td></tr>
77
+ <tr><td>Forecast inventory demand</td><td>MSE</td><td>MAPE (scale-independent)</td></tr>
78
+ <tr><td>Rank products in search</td><td>Accuracy</td><td>NDCG, MRR (ranking metrics)</td></tr>
79
+ </tbody>
80
+ </table>
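To see why MAPE is described as scale-independent in the table above, the sketch below compares MSE and MAPE on two made-up products whose forecasts are both off by 10%. The helper names and the numbers are illustrative assumptions, and MAPE is undefined whenever an actual value is zero.

```python
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error: grows with the square of the data's scale."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Undefined if any actual is 0."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical demand data: both forecasts are off by 10%.
low_volume_actual, low_volume_pred = [10.0, 20.0], [11.0, 22.0]
high_volume_actual, high_volume_pred = [1000.0, 2000.0], [1100.0, 2200.0]

if __name__ == "__main__":
    for name, actual, pred in [
        ("low volume", low_volume_actual, low_volume_pred),
        ("high volume", high_volume_actual, high_volume_pred),
    ]:
        print(f"{name}: MSE={mse(actual, pred):.1f}, MAPE={mape(actual, pred):.1f}%")
```

On this toy data the MSE differs by a factor of 10,000 between the two products (2.5 versus 25,000), while MAPE is about 10% for both, which is what makes it easier to compare across items with very different sales volumes.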
81
+
82
+ <blockquote>
83
+ <p><strong>Exam tip:</strong> The Professional ML Engineer exam often asks "which metric BEST aligns with the business objective". Fraud or medical diagnosis → Recall. Spam or other precision-critical tasks → Precision. Class imbalance → F1 or AUC-ROC.</p>
84
+ </blockquote>
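The metrics named in the exam tip can all be derived from confusion-matrix counts. The sketch below is a minimal pure-Python illustration; the function name `classification_metrics` and the fraud numbers are hypothetical and chosen only to show how accuracy can look excellent while recall is poor.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical fraud model: 10,000 transactions, 100 of them fraudulent.
# It flags 30 transactions, of which 20 are real fraud.
fraud = classification_metrics(tp=20, fp=10, fn=80, tn=9890)

if __name__ == "__main__":
    for name, value in fraud.items():
        print(f"{name}: {value:.3f}")
```

In this made-up case accuracy is 99.1% even though the model catches only 20 of the 100 fraud cases (recall 0.2), which is why accuracy is listed as the wrong metric for the fraud goal.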
85
+
86
+ <h2 id="data-assessment"><strong>4. Data Availability Assessment</strong></h2>
87
+
88
+ <table>
89
+ <thead><tr><th>Data Situation</th><th>ML Approach</th></tr></thead>
90
+ <tbody>
91
+ <tr><td>Plenty of labeled data</td><td>Fully supervised, train from scratch</td></tr>
92
+ <tr><td>Little labeled data (&lt;1000)</td><td><strong>Transfer Learning</strong> (pre-trained + fine-tune)</td></tr>
93
+ <tr><td>No labels</td><td>Unsupervised, or collect labels (Vertex AI Data Labeling)</td></tr>
94
+ <tr><td>Labels are expensive</td><td><strong>Active Learning</strong>: label the most uncertain samples first</td></tr>
95
+ <tr><td>Imbalanced data</td><td>Oversampling, undersampling, class weights</td></tr>
96
+ </tbody>
97
+ </table>
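For the imbalanced-data row, one common way to derive class weights is inverse class frequency. The sketch below assumes the formula `n_samples / (n_classes * count)`, the same heuristic scikit-learn documents for `class_weight="balanced"`; the helper name `balanced_class_weights` and the label data are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced dataset: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)

if __name__ == "__main__":
    for cls, weight in sorted(weights.items()):
        print(f"class {cls}: weight {weight:.3f}")
```

With these weights each class contributes the same total weight to the loss (90 × 0.556 ≈ 10 × 5.0), so the rare class is no longer drowned out by the majority class.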
98
+
99
+ <h2 id="google-ml-practices"><strong>5. Google's ML Best Practices</strong></h2>
100
+
101
+ <ul>
102
+ <li><strong>Start simple</strong>: Begin with the simplest model, then add complexity gradually</li>
103
+ <li><strong>Establish baseline</strong>: Compare against heuristics/rules before using ML</li>
104
+ <li><strong>Data quality first</strong>: Data preparation commonly takes most of an ML project's time (often cited as around 80%)</li>
105
+ <li><strong>Reproducibility</strong>: The pipeline must be reproducible given the same data</li>
106
+ <li><strong>Monitor in production</strong>: Models decay over time, so continuous monitoring is required</li>
107
+ </ul>
108
+
109
+ <h2 id="practice"><strong>6. Practice Questions</strong></h2>
110
+
111
+ <p><strong>Q1:</strong> A company wants to identify which of its customers are most likely to cancel their subscription in the next 30 days. They have 3 years of historical customer behavior data with known churn events. Which ML approach should they use?</p>
112
+ <ul>
113
+ <li>A) Unsupervised clustering to find customer groups</li>
114
+ <li>B) Reinforcement learning to optimize retention campaigns</li>
115
+ <li>C) Supervised binary classification with historical churn labels ✓</li>
116
+ <li>D) Anomaly detection to find unusual behavior</li>
117
+ </ul>
118
+ <p><em>Explanation: This is a classic supervised classification problem (churn = yes/no). Historical data with known outcomes (churned/not churned) provides the labels needed. Clustering would not predict individual churn probability. RL is for sequential decision making, not prediction.</em></p>
119
+
120
+ <p><strong>Q2:</strong> A medical imaging ML model achieves 98% accuracy on test data but the business team is unsatisfied. The task is detecting rare cancer cells (1% prevalence). What is the most likely issue?</p>
121
+ <ul>
122
+ <li>A) The model is overfitting to training data</li>
123
+ <li>B) Accuracy is the wrong metric — the model may be predicting "no cancer" for everything ✓</li>
124
+ <li>C) The model needs more training iterations</li>
125
+ <li>D) The test dataset is too small</li>
126
+ </ul>
127
+ <p><em>Explanation: With 1% prevalence, a model that always predicts "no cancer" already reaches 99% accuracy while having 0% recall, because it misses every cancer case. A 98% accuracy figure therefore says little about whether cancer is actually detected. For rare-class problems, recall (sensitivity) is the critical metric, not accuracy.</em></p>
128
+
129
+ <p><strong>Q3:</strong> A startup has 500 labeled product images for a new custom classification task. Which training approach is MOST appropriate?</p>
130
+ <ul>
131
+ <li>A) Train a deep learning CNN from scratch on the 500 images</li>
132
+ <li>B) Use AutoML Tabular on the image metadata</li>
133
+ <li>C) Use Transfer Learning from a pre-trained image model ✓</li>
134
+ <li>D) Apply K-Means clustering since the dataset is too small</li>
135
+ </ul>
136
+ <p><em>Explanation: With only 500 labeled examples, training from scratch would overfit severely. Transfer Learning reuses features from a model pre-trained on millions of images (e.g., ImageNet), requiring far less data to achieve good accuracy on the new task.</em></p>