@google-cloud/storage-mcp 0.2.0 → 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +37 -16
- package/dist/bundle.js +488 -9
- package/dist/tsconfig.tsbuildinfo +1 -1
- package/package.json +4 -1
package/README.md CHANGED

@@ -23,6 +23,18 @@ bucket and object management. With the Storage MCP server you can:
 
   <img src="./assets/easy_access_3x.gif" width="80%" alt="Easy Access Demo">
 
+- **Perform analytical and aggregation queries on your objects and buckets.** Perform
+  aggregations and compute statistics on entire storage inventory using [Storage Insights
+  Datasets](https://cloud.google.com/storage/docs/insights/datasets)
+
+  <img src="./assets/storage_insights_aggregation.gif" width="80%" alt="Easy Access Demo">
+
+- **Run advanced filters and searches on your data.** Search and filter your objects
+  by file type, size and other metadata fields using [Storage Insights
+  Datasets](https://cloud.google.com/storage/docs/insights/datasets)
+
+  <img src="./assets/storage_insights_filter.gif" width="80%" alt="Easy Access Demo">
+
 ## 🚀 Getting Started
 
 ### Prerequisites
@@ -126,21 +138,24 @@ accidental data loss.
 Safe tools are read-only or only create new objects without affecting existing
 ones. They will never modify or delete existing data in GCS.
 
-| Tool
-|
-| `list_buckets`
-| `get_bucket_metadata`
-| `get_bucket_location`
-| `view_iam_policy`
-| `check_iam_permissions`
-| `create_bucket`
-| `list_objects`
-| `read_object_metadata`
-| `read_object_content`
-| `download_object`
-| `write_object_new`
-| `upload_object_new`
-| `copy_object_new`
+| Tool                        | Description                                                                                                                  |
+| :-------------------------- | :--------------------------------------------------------------------------------------------------------------------------- |
+| `list_buckets`              | Lists all buckets in a project.                                                                                              |
+| `get_bucket_metadata`       | Gets comprehensive metadata for a specific bucket.                                                                           |
+| `get_bucket_location`       | Gets the location of a bucket.                                                                                               |
+| `view_iam_policy`           | Views the IAM policy for a bucket.                                                                                           |
+| `check_iam_permissions`     | Tests IAM permissions for a bucket.                                                                                          |
+| `create_bucket`             | Creates a new bucket. Fails if the bucket already exists.                                                                    |
+| `list_objects`              | Lists objects in a GCS bucket.                                                                                               |
+| `read_object_metadata`      | Reads comprehensive metadata for a specific object.                                                                          |
+| `read_object_content`       | Reads the content of a specific object.                                                                                      |
+| `download_object`           | Downloads an object from GCS to a local file.                                                                                |
+| `write_object_new`          | Writes a new object. Fails if the object already exists.                                                                     |
+| `upload_object_new`         | Uploads a file to a new object. Fails if the object already exists.                                                          |
+| `copy_object_new`           | Copies an object to a new destination. Fails if the destination already exists.                                              |
+| `get_metadata_table_schema` | Checks if GCS insights service is enabled and returns the BigQuery table schema for a given insights dataset configuration.  |
+| `execute_insights_query`    | Executes a BigQuery SQL query against an insights dataset and returns the result.                                            |
+| `list_insights_configs`     | Lists the names of all Storage Insights dataset configurations for a given project.                                          |
 
 ### Destructive Tools
 
@@ -179,7 +194,7 @@ We welcome contributions! Whether you're fixing bugs, sharing feedback, or
 improving documentation, your contributions are welcome. Please read our
 [Contributing Guide](CONTRIBUTING.md) to get started.
 
-## 🎬
+## 🎬 Demos
 
 <p align="center"><b>Click to watch the Storage MCP demo</b><br/>
 <a href="./assets/storage_mcp_demo.mp4" title="Click to play demo">
@@ -187,6 +202,12 @@ improving documentation, your contributions are welcome. Please read our
 </a>
 </p>
 
+<p align="center"><b>Click to watch the Storage MCP demo powered by Storage Insights</b><br/>
+<a href="./assets/storage_insights_demo.mp4" title="Click to play demo">
+<img width="80%" alt="Storage Insights MCP Demo Video" src="./assets/storage_insights_demo_thumbnail.png">
+</a>
+</p>
+
 ## 📄 Important Notes
 
 This repository is currently in preview and may see breaking changes. This
package/dist/bundle.js CHANGED

@@ -13675,11 +13675,11 @@ var McpServer = class {
     this._registeredPrompts[name] = registeredPrompt;
     return registeredPrompt;
   }
-  _createRegisteredTool(name, title, description,
+  _createRegisteredTool(name, title, description, inputSchema25, outputSchema, annotations, callback) {
     const registeredTool = {
       title,
       description,
-      inputSchema:
+      inputSchema: inputSchema25 === void 0 ? void 0 : external_exports.object(inputSchema25),
       outputSchema: outputSchema === void 0 ? void 0 : external_exports.object(outputSchema),
       annotations,
       callback,
@@ -13721,7 +13721,7 @@ var McpServer = class {
       throw new Error(`Tool ${name} is already registered`);
    }
    let description;
-    let
+    let inputSchema25;
    let outputSchema;
    let annotations;
    if (typeof rest[0] === "string") {
@@ -13730,7 +13730,7 @@ var McpServer = class {
    if (rest.length > 1) {
      const firstArg = rest[0];
      if (isZodRawShape(firstArg)) {
-
+        inputSchema25 = rest.shift();
        if (rest.length > 1 && typeof rest[0] === "object" && rest[0] !== null && !isZodRawShape(rest[0])) {
          annotations = rest.shift();
        }
@@ -13739,7 +13739,7 @@ var McpServer = class {
      }
    }
    const callback = rest[0];
-    return this._createRegisteredTool(name, void 0, description,
+    return this._createRegisteredTool(name, void 0, description, inputSchema25, outputSchema, annotations, callback);
  }
  /**
   * Registers a tool with a config object and callback.
@@ -13748,8 +13748,8 @@ var McpServer = class {
    if (this._registeredTools[name]) {
      throw new Error(`Tool ${name} is already registered`);
    }
-    const { title, description, inputSchema:
-    return this._createRegisteredTool(name, title, description,
+    const { title, description, inputSchema: inputSchema25, outputSchema, annotations } = config;
+    return this._createRegisteredTool(name, title, description, inputSchema25, outputSchema, annotations, cb);
  }
  prompt(name, ...rest) {
    if (this._registeredPrompts[name]) {
@@ -13851,10 +13851,16 @@ var EMPTY_COMPLETION_RESULT = {
 };
 
 // src/utility/api_client_factory.ts
+import { BigQuery } from "@google-cloud/bigquery";
 import { Storage } from "@google-cloud/storage";
+import { ServiceUsageClient } from "@google-cloud/service-usage";
+import { StorageInsightsClient } from "@google-cloud/storageinsights";
 var ApiClientFactory = class _ApiClientFactory {
   static instance;
   storageClient;
+  serviceUsageClient;
+  storageInsightsClient;
+  bigqueryClient;
   constructor() {
   }
   static getInstance() {
@@ -13869,6 +13875,24 @@ var ApiClientFactory = class _ApiClientFactory {
     }
     return this.storageClient;
   }
+  getServiceUsageClient() {
+    if (!this.serviceUsageClient) {
+      this.serviceUsageClient = new ServiceUsageClient();
+    }
+    return this.serviceUsageClient;
+  }
+  getStorageInsightsClient() {
+    if (!this.storageInsightsClient) {
+      this.storageInsightsClient = new StorageInsightsClient();
+    }
+    return this.storageInsightsClient;
+  }
+  getBigQueryClient() {
+    if (!this.bigqueryClient) {
+      this.bigqueryClient = new BigQuery();
+    }
+    return this.bigqueryClient;
+  }
 };
 var apiClientFactory = ApiClientFactory.getInstance();
 
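The `ApiClientFactory` hunks above add three lazily constructed clients behind the existing singleton. The pattern can be sketched in isolation; `ClientFactory` and `FakeClient` below are illustrative stand-ins, not the package's actual exports:

```typescript
// Minimal sketch of the lazy-singleton factory pattern used by ApiClientFactory.
// FakeClient stands in for BigQuery / ServiceUsageClient / StorageInsightsClient.
class FakeClient {
  constructedAt = Date.now();
}

class ClientFactory {
  private static instance?: ClientFactory;
  private client?: FakeClient;

  private constructor() {}

  // One factory per process.
  static getInstance(): ClientFactory {
    if (!ClientFactory.instance) {
      ClientFactory.instance = new ClientFactory();
    }
    return ClientFactory.instance;
  }

  // The client is created on first use and cached for later calls.
  getClient(): FakeClient {
    if (!this.client) {
      this.client = new FakeClient();
    }
    return this.client;
  }
}

const factory = ClientFactory.getInstance();
```

Because construction is deferred to the first `get*Client()` call, adding BigQuery, Service Usage, and Storage Insights clients to the factory costs nothing for sessions that never touch the insights tools.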
@@ -15588,6 +15612,443 @@ var registerWriteObjectSafeTool = (server) => {
   );
 };
 
+// src/tools/insights/get_metadata_table_schema.ts
+var serviceName = "storageinsights.googleapis.com";
+var inputSchema22 = {
+  datasetConfigName: external_exports.string().describe("The name of the dataset configuration."),
+  datasetConfigLocation: external_exports.string().describe("The location of the dataset configuration."),
+  projectId: external_exports.string().optional().describe("The project ID to check Storage Insights availability for.")
+};
+async function getMetadataTableSchema(params) {
+  const bigqueryClient = apiClientFactory.getBigQueryClient();
+  const storageInsightsClient = apiClientFactory.getStorageInsightsClient();
+  const serviceUsageClient = apiClientFactory.getServiceUsageClient();
+  const projectId = params.projectId || process.env["GOOGLE_CLOUD_PROJECT"] || process.env["GCP_PROJECT_ID"];
+  if (!projectId) {
+    throw new Error(
+      "Project ID not specified. Please specify via the projectId parameter or GOOGLE_CLOUD_PROJECT or GCP_PROJECT_ID environment variable."
+    );
+  }
+  const [services] = await serviceUsageClient.listServices({
+    parent: `projects/${projectId}`,
+    filter: "state:ENABLED"
+  });
+  const isEnabled = services.some(
+    (service) => service.config?.name === serviceName
+  );
+  if (!isEnabled) {
+    throw new Error(
+      `Storage Insights API is not enabled for project ${projectId}. Please enable it to proceed.`
+    );
+  }
+  let config;
+  try {
+    [config] = await storageInsightsClient.getDatasetConfig({
+      name: `projects/${projectId}/locations/${params.datasetConfigLocation}/datasetConfigs/${params.datasetConfigName}`
+    });
+  } catch (error) {
+    const err = error instanceof Error ? error : void 0;
+    logger.error("Error getting dataset config:", err);
+    return {
+      content: [
+        {
+          type: "text",
+          text: JSON.stringify({
+            error: "Failed to retrieve dataset configuration",
+            details: err?.message
+          })
+        }
+      ]
+    };
+  }
+  const objectHints = /* @__PURE__ */ new Map([
+    ["snapshotTime", "The snapshot time of the object metadata in RFC 3339 format."],
+    ["bucket", "The name of the bucket containing this object."],
+    ["location", "The location of the source bucket."],
+    [
+      "componentCount",
+      "Returned for composite objects only. Number of non-composite objects in the composite object."
+    ],
+    ["contentDisposition", "Content-Disposition of the object data."],
+    ["contentEncoding", "Content-Encoding of the object data."],
+    ["contentLanguage", "Content-Language of the object data."],
+    [
+      "contentType",
+      "Content-Type of the object data. If an object is stored without a Content-Type, it is served as application/octet-stream."
+    ],
+    [
+      "crc32c",
+      "CRC32c checksum, as described in RFC 4960, Appendix B; encoded using base64 in big-endian byte order."
+    ],
+    ["customTime", "A user-specified timestamp for the object in RFC 3339 format."],
+    ["etag", "HTTP 1.1 Entity tag for the object."],
+    ["eventBasedHold", "Whether or not the object is subject to an event-based hold."],
+    ["generation", "The content generation of this object. Used for object versioning."],
+    [
+      "md5Hash",
+      "MD5 hash of the data, encoded using base64. This field is not present for composite objects."
+    ],
+    ["mediaLink", "A URL for downloading the object's data."],
+    ["metadata", "User-provided metadata, in key/value pairs."],
+    ["metadata.key", "An individual metadata entry key."],
+    ["metadata.value", "An individual metadata entry value."],
+    ["metageneration", "The version of the metadata for this object at this generation."],
+    ["name", "The name of the object."],
+    ["selfLink", "A URL for this object."],
+    ["size", "Content-Length of the data in bytes."],
+    ["storageClass", "Storage class of the object."],
+    ["temporaryHold", "Whether or not the object is subject to a temporary hold."],
+    ["timeCreated", "The creation time of the object in RFC 3339 format."],
+    [
+      "timeDeleted",
+      "The deletion time of the object in RFC 3339 format. Returned if and only if this version of the object is no longer a live version, but remains in the bucket as a noncurrent version."
+    ],
+    ["updated", "The modification time of the object metadata in RFC 3339 format."],
+    ["timeStorageClassUpdated", "The time at which the object's storage class was last changed."],
+    [
+      "retentionExpirationTime",
+      "The earliest time that the object can be deleted, in RFC 3339 format."
+    ],
+    [
+      "softDeleteTime",
+      "If this object has been soft-deleted, this is the time at which it became soft-deleted."
+    ],
+    [
+      "hardDeleteTime",
+      "This is the time (in the future) when the object will no longer be restorable."
+    ],
+    ["project", "The project number of the project the bucket belongs to."]
+  ]);
+  const bucketHints = /* @__PURE__ */ new Map([
+    ["snapshotTime", "The snapshot time of the metadata in RFC 3339 format."],
+    ["name", "The name of the source bucket."],
+    ["location", 'The location of the source bucket (e.g., "US", "EU", "ASIA-EAST1").'],
+    ["project", "The project number of the project the bucket belongs to."],
+    [
+      "storageClass",
+      `The bucket's default storage class (e.g., "STANDARD", "NEARLINE", "COLDLINE").`
+    ],
+    [
+      "public.bucketPolicyOnly",
+      "Deprecated field. Whether to enforcement uniform bucket-level access. This concept is now represented by iamConfiguration.uniformBucketLevelAccess.enabled."
+    ],
+    [
+      "public.publicAccessPrevention",
+      `The bucket's public access prevention status ("inherited" or "enforced"). This is the same setting as iamConfiguration.publicAccessPrevention.`
+    ],
+    ["autoclass.enabled", "Whether Autoclass is enabled for the bucket."],
+    ["autoclass.toggleTime", "The time Autoclass was last enabled or disabled."],
+    ["versioning", "Boolean indicating if Object Versioning is enabled for the bucket."],
+    [
+      "lifecycle",
+      "Boolean indicating if the bucket has an Object Lifecycle Management configuration."
+    ],
+    ["metageneration", "The metadata generation of this bucket."],
+    [
+      "timeCreated",
+      "The creation time of the bucket in RFC 3339 format. To perform date calculations, use DATE_SUB or DATE_ADD with CURRENT_DATE()"
+    ],
+    ["tags.tagMap.key", "The key of a tag."],
+    ["tags.tagMap.value", "The value of a tag."],
+    ["tags.lastUpdatedTime", "The last updated time for the tags."],
+    ["labels.key", "An individual label entry key."],
+    ["labels.value", "An individual label entry value."],
+    [
+      "softDeletePolicy.retentionDurationSeconds",
+      "The duration in seconds that soft-deleted objects will be retained."
+    ],
+    [
+      "softDeletePolicy.effectiveTime",
+      "The time from which the soft delete policy became effective."
+    ],
+    [
+      "iamConfiguration.uniformBucketLevelAccess.enabled",
+      "If True, Uniform bucket-level access is enabled, disabling object-level ACLs. This replaces the legacy public.bucketPolicyOnly field."
+    ],
+    [
+      "iamConfiguration.publicAccessPrevention",
+      `The bucket's public access prevention status ("inherited" or "enforced"). This is the same setting as public.publicAccessPrevention.`
+    ],
+    [
+      "resourceTags",
+      "This field appears to be redundant. Bucket resource tags are properly represented under the tags field."
+    ],
+    [
+      "objectCount",
+      "Total number of objects in the bucket. This is a recent addition for aggregated bucket metrics."
+    ],
+    [
+      "totalSize",
+      "Total size of the bucket in bytes. This is a recent addition for aggregated bucket metrics."
+    ]
+  ]);
+  try {
+    const linkedDataset = config.link?.dataset;
+    if (linkedDataset) {
+      const parts = linkedDataset.split("/");
+      const datasetId = parts[parts.length - 1];
+      if (!datasetId) {
+        throw new Error("Could not extract dataset ID from linked dataset.");
+      }
+      const bucketViewId = "bucket_attributes_latest_snapshot_view";
+      const objectViewId = "object_attributes_latest_snapshot_view";
+      const [bucketViewMetadata] = await bigqueryClient.dataset(datasetId).table(bucketViewId).getMetadata();
+      const [objectViewMetadata] = await bigqueryClient.dataset(datasetId).table(objectViewId).getMetadata();
+      const bucketViewFields = bucketViewMetadata.schema.fields.map(
+        (field) => {
+          const fieldWithHint = { ...field };
+          if (field.name && bucketHints.has(field.name)) {
+            fieldWithHint.hint = bucketHints.get(field.name);
+          }
+          return fieldWithHint;
+        }
+      );
+      const objectViewFields = objectViewMetadata.schema.fields.map(
+        (field) => {
+          const fieldWithHint = { ...field };
+          if (field.name && objectHints.has(field.name)) {
+            fieldWithHint.hint = objectHints.get(field.name);
+          }
+          return fieldWithHint;
+        }
+      );
+      const result = {
+        [`${datasetId}.${bucketViewId}`]: bucketViewFields,
+        [`${datasetId}.${objectViewId}`]: objectViewFields,
+        ...config
+      };
+      return {
+        content: [
+          {
+            type: "text",
+            text: JSON.stringify(result)
+          }
+        ]
+      };
+    }
+    throw new Error("Configuration does not have a linked dataset.");
+  } catch (error) {
+    const err = error instanceof Error ? error : void 0;
+    logger.error("Error getting metadata table schema:", err);
+    return {
+      content: [
+        {
+          type: "text",
+          text: JSON.stringify({
+            error: "Failed to get metadata table schema",
+            details: err?.message
+          })
+        }
+      ]
+    };
+  }
+}
+var registerGetMetadataTableSchemaTool = (server) => {
+  server.registerTool(
+    "get_metadata_table_schema",
+    {
+      description: "Checks if GCS insights service is enabled and returns the BigQuery table schema for a given insights dataset configuration in JSON format. Also returns hints for each column in the table",
+      inputSchema: inputSchema22,
+      annotations: {
+        displayOutput: false
+      }
+    },
+    getMetadataTableSchema
+  );
+};
+
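The schema `get_metadata_table_schema` returns is the BigQuery view schema with a human-readable `hint` attached to recognized field names. Stripped of the API calls, the merge step reduces to a small pure function (a sketch; `SchemaField` is an illustrative type, not the client library's):

```typescript
// Sketch of the hint-attachment step: copy each schema field and, when the
// field name has a documented hint, add it under a `hint` property.
interface SchemaField {
  name?: string;
  type?: string;
  hint?: string;
}

function attachHints(
  fields: SchemaField[],
  hints: Map<string, string>
): SchemaField[] {
  return fields.map((field) => {
    const fieldWithHint: SchemaField = { ...field };
    if (field.name && hints.has(field.name)) {
      fieldWithHint.hint = hints.get(field.name);
    }
    return fieldWithHint;
  });
}
```

Fields without a known hint pass through unchanged, so the tool degrades gracefully when the view gains new columns.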
+// src/tools/insights/execute_insights_query.ts
+var inputSchema23 = {
+  config: external_exports.string().describe(
+    "The JSON object of the BigQuery table schema for a given insights dataset configuration."
+  ),
+  query: external_exports.string().describe("The BigQuery SQL query to execute."),
+  jobTimeoutMs: external_exports.number().optional().default(2e4).describe("The maximum amount of time for the job to run on the server.")
+};
+async function executeInsightsQuery(params) {
+  const bigqueryClient = apiClientFactory.getBigQueryClient();
+  try {
+    let config;
+    try {
+      config = JSON.parse(params.config);
+    } catch (_e) {
+      throw new Error("Invalid configuration provided. Expected a JSON object or a JSON string.");
+    }
+    if (typeof config !== "object" || config === null) {
+      throw new Error("Invalid configuration provided. Expected a JSON object.");
+    }
+    const linkedDataset = config.link?.dataset;
+    if (!linkedDataset) {
+      throw new Error("The provided configuration is missing the `link.dataset` property.");
+    }
+    const nameParts = config.name?.split("/");
+    if (!nameParts || nameParts.length < 4) {
+      throw new Error(
+        "Invalid configuration name format. Expected `projects/{projectId}/locations/{locationId}/datasetConfigs/{datasetConfigId}`."
+      );
+    }
+    const projectId = nameParts[1];
+    const datasetId = linkedDataset.split("/").pop();
+    const location = nameParts[3];
+    if (!location) {
+      throw new Error("Could not extract location from the configuration name.");
+    }
+    if (!datasetId) {
+      throw new Error("Could not extract datasetId from the linked dataset.");
+    }
+    const baseQueryOptions = {
+      query: params.query,
+      jobTimeoutMs: params.jobTimeoutMs,
+      location
+    };
+    const options = {};
+    if (projectId) {
+      options.projectId = projectId;
+    }
+    logger.info(`Executing query with location: ${location}`);
+    logger.info(`Executing query with datasetId: ${datasetId}`);
+    logger.info(`Executing query with projectId: ${projectId}`);
+    logger.info("Performing BigQuery dry run...");
+    try {
+      const [dryRunJob] = await bigqueryClient.dataset(datasetId, options).createQueryJob({
+        ...baseQueryOptions,
+        dryRun: true
+      });
+      logger.info(`Dry run successful for query. Job ID: ${dryRunJob.id}`);
+    } catch (error) {
+      const err = error;
+      logger.error("BigQuery dry run failed:", err);
+      return {
+        content: [
+          {
+            type: "text",
+            text: JSON.stringify({
+              error: "Validation failed: Invalid BigQuery SQL or access error during dry run",
+              error_type: "QueryValidationError",
+              details: err?.message
+            })
+          }
+        ]
+      };
+    }
+    logger.info("Dry run passed. Executing BigQuery query...");
+    const [job] = await bigqueryClient.dataset(datasetId, options).createQueryJob(baseQueryOptions);
+    logger.info(`Job ${job.id} started.`);
+    const [rows] = await job.getQueryResults();
+    logger.info(`Successfully executed query.`);
+    return {
+      content: [
+        {
+          type: "text",
+          text: JSON.stringify(rows)
+        }
+      ]
+    };
+  } catch (error) {
+    const err = error;
+    logger.error("Error executing insights query:", err);
+    let errorType = "Unknown";
+    if (err.message.includes("Job timed out")) {
+      errorType = "Timeout";
+    }
+    return {
+      content: [
+        {
+          type: "text",
+          text: JSON.stringify({
+            error: "Failed to execute insights query",
+            error_type: errorType,
+            details: err?.message
+          })
+        }
+      ]
+    };
+  }
+}
+var registerExecuteInsightsQueryTool = (server) => {
+  server.registerTool(
+    "execute_insights_query",
+    {
+      description: "Executes a BigQuery SQL query against an insights dataset and returns the result.",
+      inputSchema: inputSchema23
+    },
+    executeInsightsQuery
+  );
+};
+
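Much of `executeInsightsQuery` above is resource-name bookkeeping: the project ID and location come from positions 1 and 3 of the config name, and the dataset ID is the last segment of the linked dataset path. That logic can be sketched standalone (with slightly stricter validation than the bundle's length check; function names are illustrative):

```typescript
// Sketch of the resource-name parsing done by execute_insights_query.
// config.name looks like:
//   projects/{projectId}/locations/{location}/datasetConfigs/{configId}
function parseConfigName(name: string): { projectId: string; location: string } {
  const [root, projectId, locations, location] = name.split("/");
  if (root !== "projects" || locations !== "locations" || !projectId || !location) {
    throw new Error(
      "Invalid configuration name format. Expected `projects/{projectId}/locations/{locationId}/datasetConfigs/{datasetConfigId}`."
    );
  }
  return { projectId, location };
}

// The linked dataset looks like projects/{projectId}/datasets/{datasetId};
// only the final path segment is needed.
function datasetIdFromLink(linkedDataset: string): string {
  const datasetId = linkedDataset.split("/").pop();
  if (!datasetId) {
    throw new Error("Could not extract datasetId from the linked dataset.");
  }
  return datasetId;
}
```

The location matters because the BigQuery job must run in the same region as the linked insights dataset, and the project ID is passed through so cross-project configs query the right billing project.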
+// src/tools/insights/list_insights_configs.ts
+var serviceName2 = "storageinsights.googleapis.com";
+var inputSchema24 = {
+  projectId: external_exports.string().optional().describe("The project ID to list Storage Insights dataset configurations for.")
+};
+async function listInsightsConfigs(params) {
+  const storageInsightsClient = apiClientFactory.getStorageInsightsClient();
+  const serviceUsageClient = apiClientFactory.getServiceUsageClient();
+  const projectId = params.projectId || process.env["GOOGLE_CLOUD_PROJECT"] || process.env["GCP_PROJECT_ID"];
+  if (!projectId) {
+    throw new Error(
+      "Project ID not specified. Please specify via the projectId parameter or GOOGLE_CLOUD_PROJECT or GCP_PROJECT_ID environment variable."
+    );
+  }
+  const [services] = await serviceUsageClient.listServices({
+    parent: `projects/${projectId}`,
+    filter: "state:ENABLED"
+  });
+  const isEnabled = services.some(
+    (service) => service.config?.name === serviceName2
+  );
+  if (!isEnabled) {
+    throw new Error(
+      `Storage Insights API is not enabled for project ${projectId}. Please enable it to proceed.`
+    );
+  }
+  try {
+    const parent = `projects/${projectId}/locations/-`;
+    const iterable = storageInsightsClient.listDatasetConfigsAsync({ parent });
+    const configNames = [];
+    for await (const config of iterable) {
+      if (config.name) {
+        configNames.push(config.name);
+      }
+    }
+    logger.info(`Successfully listed ${configNames.length} dataset config names.`);
+    return {
+      content: [
+        {
+          type: "text",
+          text: JSON.stringify({
+            configurations: configNames
+          })
+        }
+      ]
+    };
+  } catch (error) {
+    const err = error instanceof Error ? error : void 0;
+    logger.error("Error listing dataset configs:", err);
+    return {
+      content: [
+        {
+          type: "text",
+          text: JSON.stringify({
+            error: "Failed to list dataset configurations",
+            details: err?.message
+          })
+        }
+      ]
+    };
+  }
+}
+var registerListInsightsConfigsTool = (server) => {
+  server.registerTool(
+    "list_insights_configs",
+    {
+      description: "Lists the names of all Storage Insights dataset configurations for a given project.",
+      inputSchema: inputSchema24
+    },
+    listInsightsConfigs
+  );
+};
+
 // src/tools/index.ts
 var commonSafeTools = [
   registerListBucketsTool,
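Both `get_metadata_table_schema` and `list_insights_configs` gate on the same Service Usage check: list the project's enabled services and look for `storageinsights.googleapis.com`. Reduced to its pure part (a sketch over plain objects standing in for the `listServices({ filter: "state:ENABLED" })` response):

```typescript
// Sketch of the enablement check both insights tools perform on the
// already-filtered list of ENABLED services.
interface EnabledService {
  config?: { name?: string };
}

const SERVICE_NAME = "storageinsights.googleapis.com";

function insightsEnabled(services: EnabledService[]): boolean {
  return services.some((service) => service.config?.name === SERVICE_NAME);
}
```

Failing fast here turns a confusing downstream BigQuery or Storage Insights permission error into a clear "enable the API" message.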
@@ -15600,7 +16061,10 @@ var commonSafeTools = [
   registerReadObjectContentTool,
   registerReadObjectMetadataTool,
   registerDownloadObjectTool,
-  registerDeleteObjectTool
+  registerDeleteObjectTool,
+  registerGetMetadataTableSchemaTool,
+  registerExecuteInsightsQueryTool,
+  registerListInsightsConfigsTool
 ];
 var safeWriteTools = [
   registerWriteObjectSafeTool,
@@ -15717,7 +16181,7 @@ var StdioServerTransport = class {
 // package.json
 var package_default = {
   name: "@google-cloud/storage-mcp",
-  version: "0.
+  version: "0.3.0",
   type: "module",
   main: "dist/bundle.js",
   bin: {
@@ -15777,7 +16241,10 @@ var package_default = {
     vitest: "^3.2.4"
   },
   dependencies: {
+    "@google-cloud/bigquery": "^7.0.0",
+    "@google-cloud/service-usage": "^4.2.0",
     "@google-cloud/storage": "^7.17.1",
+    "@google-cloud/storageinsights": "^2.2.0",
     "@modelcontextprotocol/sdk": "^1.17.1",
     "@types/yargs": "^17.0.33",
     chardet: "^2.1.0",
@@ -21141,6 +21608,18 @@ For example, you can use the tools to:
 
 - **Clarify Ambiguity:** Do not guess or assume values for required parameters like bucket names. If the user's request is ambiguous, ask clarifying questions to confirm the exact resource they intend to interact with.
 - **Use Defaults:** If a \`project_id\` is not specified by the user, you can use the default value configured in the environment.
+- **Terminology Grounding:** When users ask for "insights datasets," they are referring to BigQuery datasets populated by Storage Insights Configurations.
+  Do NOT confuse this with legacy "Inventory Reports" (which export CSV/Parquet files to GCS).
+- **Insights Intent Identification:** There is a set of 3 tools called Insights tool which you have to use whenever you detect an Storage Insights intent.
+  To identify "Storage Insights Intent" queries, look for prompts focused on analyzing Google Cloud Storage configuration metadata. These requests typically stem from cost optimization, security auditing, or data governance needs. Users ask to aggregate, filter, or or analyze "buckets" and "objects" based on attributes like "size," "age," "location," and "storage class." Key indicators include checks on configurations such as "public access prevention," "versioning," and "lifecycle policies." The queries often involve complex operations, including statistical analysis (correlations, percentiles), ranking, and filtering based on user-defined "tags" or "labels" across projects and regions.
+- **Insights Intent Handling:** Queries identified to have a "Storage Insights Intent" MUST be handled by generating BigQuery SQL queries against Storage Insights datasets. Example: "What is the total size of my storage?", "Which bucket has the most objects?", "Show the distribution of my storage classes."
+  If the intent is classified as " Storage Insights Intent", proceed to execute the below steps in order:
+
+  1. Check if the user provided the \`config_name\` and \`config_location\` of the dataset configuration. If not then ask the user for the \`config_name\` and \`config_location\` of the dataset configuration they would like to use and remember it.
+  2. If the dataset configuration is available then pass this dataset configuration name and location to get_metadata_table_schema which will return the schema with some hints. If the dataset config name is returned as incorrect, call the list_insights_configs tool and then ask the user to select the correct dataset configuration name and location again and don't list the available configs unless user explicitly asks for it and retry getting the metadata table schema. Remember the schema for the remaining session unless user asks to change the dataset.
+  3. Once you have the dataset table schema, use it to draft query/queries and call the execute_insights_query tool get relevant data. If the query fails due to some reason, correct it and retry.
+  **Note on BigQuery Table References:** When constructing BigQuery SQL queries, ensure that table references are fully qualified with the project ID. The format should be \`project_id.dataset_id.table_id\`. For example, if the project ID is \`my-gcp-project\`, the dataset ID is \`my_dataset\`, and the table ID is \`my_table\`, the reference in the query should be \`my-gcp-project.my_dataset.my_table\`.
+  4. Based on the query results, answer the users query.
 
 ## GCS Reference Documentation
 
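The prompt's note on table references can be made concrete with a tiny helper (hypothetical; the package itself leaves query construction to the model). Backtick quoting matters because project IDs routinely contain dashes, which bare identifiers in BigQuery SQL do not allow:

```typescript
// Hypothetical helper for the fully qualified, backtick-quoted table
// reference format the prompt requires: `project_id.dataset_id.table_id`.
function qualifiedTableRef(projectId: string, datasetId: string, tableId: string): string {
  return "`" + projectId + "." + datasetId + "." + tableId + "`";
}

// e.g. SELECT COUNT(*) FROM `my-gcp-project.my_dataset.object_attributes_latest_snapshot_view`
const ref = qualifiedTableRef(
  "my-gcp-project",
  "my_dataset",
  "object_attributes_latest_snapshot_view"
);
```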