@google-cloud/storage-mcp 0.2.0 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -23,6 +23,18 @@ bucket and object management. With the Storage MCP server you can:
23
23
 
24
24
  <img src="./assets/easy_access_3x.gif" width="80%" alt="Easy Access Demo">
25
25
 
26
+ - **Perform analytical and aggregation queries on your objects and buckets.** Compute
27
+ aggregations and statistics on your entire storage inventory using [Storage Insights
28
+ Datasets](https://cloud.google.com/storage/docs/insights/datasets).
29
+
30
+ <img src="./assets/storage_insights_aggregation.gif" width="80%" alt="Storage Insights Aggregation Demo">
31
+
32
+ - **Run advanced filters and searches on your data.** Search and filter your objects
33
+ by file type, size, and other metadata fields using [Storage Insights
34
+ Datasets](https://cloud.google.com/storage/docs/insights/datasets).
35
+
36
+ <img src="./assets/storage_insights_filter.gif" width="80%" alt="Storage Insights Filter Demo">
37
+
26
38
  ## 🚀 Getting Started
27
39
 
28
40
  ### Prerequisites
@@ -126,21 +138,24 @@ accidental data loss.
126
138
  Safe tools are read-only or only create new objects without affecting existing
127
139
  ones. They will never modify or delete existing data in GCS.
128
140
 
129
- | Tool | Description |
130
- | :---------------------- | :------------------------------------------------------------------------------ |
131
- | `list_buckets` | Lists all buckets in a project. |
132
- | `get_bucket_metadata` | Gets comprehensive metadata for a specific bucket. |
133
- | `get_bucket_location` | Gets the location of a bucket. |
134
- | `view_iam_policy` | Views the IAM policy for a bucket. |
135
- | `check_iam_permissions` | Tests IAM permissions for a bucket. |
136
- | `create_bucket` | Creates a new bucket. Fails if the bucket already exists. |
137
- | `list_objects` | Lists objects in a GCS bucket. |
138
- | `read_object_metadata` | Reads comprehensive metadata for a specific object. |
139
- | `read_object_content` | Reads the content of a specific object. |
140
- | `download_object` | Downloads an object from GCS to a local file. |
141
- | `write_object_new` | Writes a new object. Fails if the object already exists. |
142
- | `upload_object_new` | Uploads a file to a new object. Fails if the object already exists. |
143
- | `copy_object_new` | Copies an object to a new destination. Fails if the destination already exists. |
141
+ | Tool | Description |
142
+ | :-------------------------- | :-------------------------------------------------------------------------------------------------------------------------- |
143
+ | `list_buckets` | Lists all buckets in a project. |
144
+ | `get_bucket_metadata` | Gets comprehensive metadata for a specific bucket. |
145
+ | `get_bucket_location` | Gets the location of a bucket. |
146
+ | `view_iam_policy` | Views the IAM policy for a bucket. |
147
+ | `check_iam_permissions` | Tests IAM permissions for a bucket. |
148
+ | `create_bucket` | Creates a new bucket. Fails if the bucket already exists. |
149
+ | `list_objects` | Lists objects in a GCS bucket. |
150
+ | `read_object_metadata` | Reads comprehensive metadata for a specific object. |
151
+ | `read_object_content` | Reads the content of a specific object. |
152
+ | `download_object` | Downloads an object from GCS to a local file. |
153
+ | `write_object_new` | Writes a new object. Fails if the object already exists. |
154
+ | `upload_object_new` | Uploads a file to a new object. Fails if the object already exists. |
155
+ | `copy_object_new` | Copies an object to a new destination. Fails if the destination already exists. |
156
+ | `get_metadata_table_schema` | Checks if GCS insights service is enabled and returns the BigQuery table schema for a given insights dataset configuration. |
157
+ | `execute_insights_query` | Executes a BigQuery SQL query against an insights dataset and returns the result. |
158
+ | `list_insights_configs` | Lists the names of all Storage Insights dataset configurations for a given project. |
144
159
 
145
160
  ### Destructive Tools
146
161
 
@@ -179,7 +194,7 @@ We welcome contributions! Whether you're fixing bugs, sharing feedback, or
179
194
  improving documentation, your contributions are welcome. Please read our
180
195
  [Contributing Guide](CONTRIBUTING.md) to get started.
181
196
 
182
- ## 🎬 Demo
197
+ ## 🎬 Demos
183
198
 
184
199
  <p align="center"><b>Click to watch the Storage MCP demo</b><br/>
185
200
  <a href="./assets/storage_mcp_demo.mp4" title="Click to play demo">
@@ -187,6 +202,12 @@ improving documentation, your contributions are welcome. Please read our
187
202
  </a>
188
203
  </p>
189
204
 
205
+ <p align="center"><b>Click to watch the Storage MCP demo powered by Storage Insights</b><br/>
206
+ <a href="./assets/storage_insights_demo.mp4" title="Click to play demo">
207
+ <img width="80%" alt="Storage Insights MCP Demo Video" src="./assets/storage_insights_demo_thumbnail.png">
208
+ </a>
209
+ </p>
210
+
190
211
  ## 📄 Important Notes
191
212
 
192
213
  This repository is currently in preview and may see breaking changes. This
package/dist/bundle.js CHANGED
@@ -13675,11 +13675,11 @@ var McpServer = class {
13675
13675
  this._registeredPrompts[name] = registeredPrompt;
13676
13676
  return registeredPrompt;
13677
13677
  }
13678
- _createRegisteredTool(name, title, description, inputSchema22, outputSchema, annotations, callback) {
13678
+ _createRegisteredTool(name, title, description, inputSchema25, outputSchema, annotations, callback) {
13679
13679
  const registeredTool = {
13680
13680
  title,
13681
13681
  description,
13682
- inputSchema: inputSchema22 === void 0 ? void 0 : external_exports.object(inputSchema22),
13682
+ inputSchema: inputSchema25 === void 0 ? void 0 : external_exports.object(inputSchema25),
13683
13683
  outputSchema: outputSchema === void 0 ? void 0 : external_exports.object(outputSchema),
13684
13684
  annotations,
13685
13685
  callback,
@@ -13721,7 +13721,7 @@ var McpServer = class {
13721
13721
  throw new Error(`Tool ${name} is already registered`);
13722
13722
  }
13723
13723
  let description;
13724
- let inputSchema22;
13724
+ let inputSchema25;
13725
13725
  let outputSchema;
13726
13726
  let annotations;
13727
13727
  if (typeof rest[0] === "string") {
@@ -13730,7 +13730,7 @@ var McpServer = class {
13730
13730
  if (rest.length > 1) {
13731
13731
  const firstArg = rest[0];
13732
13732
  if (isZodRawShape(firstArg)) {
13733
- inputSchema22 = rest.shift();
13733
+ inputSchema25 = rest.shift();
13734
13734
  if (rest.length > 1 && typeof rest[0] === "object" && rest[0] !== null && !isZodRawShape(rest[0])) {
13735
13735
  annotations = rest.shift();
13736
13736
  }
@@ -13739,7 +13739,7 @@ var McpServer = class {
13739
13739
  }
13740
13740
  }
13741
13741
  const callback = rest[0];
13742
- return this._createRegisteredTool(name, void 0, description, inputSchema22, outputSchema, annotations, callback);
13742
+ return this._createRegisteredTool(name, void 0, description, inputSchema25, outputSchema, annotations, callback);
13743
13743
  }
13744
13744
  /**
13745
13745
  * Registers a tool with a config object and callback.
@@ -13748,8 +13748,8 @@ var McpServer = class {
13748
13748
  if (this._registeredTools[name]) {
13749
13749
  throw new Error(`Tool ${name} is already registered`);
13750
13750
  }
13751
- const { title, description, inputSchema: inputSchema22, outputSchema, annotations } = config;
13752
- return this._createRegisteredTool(name, title, description, inputSchema22, outputSchema, annotations, cb);
13751
+ const { title, description, inputSchema: inputSchema25, outputSchema, annotations } = config;
13752
+ return this._createRegisteredTool(name, title, description, inputSchema25, outputSchema, annotations, cb);
13753
13753
  }
13754
13754
  prompt(name, ...rest) {
13755
13755
  if (this._registeredPrompts[name]) {
@@ -13851,10 +13851,16 @@ var EMPTY_COMPLETION_RESULT = {
13851
13851
  };
13852
13852
 
13853
13853
  // src/utility/api_client_factory.ts
13854
+ import { BigQuery } from "@google-cloud/bigquery";
13854
13855
  import { Storage } from "@google-cloud/storage";
13856
+ import { ServiceUsageClient } from "@google-cloud/service-usage";
13857
+ import { StorageInsightsClient } from "@google-cloud/storageinsights";
13855
13858
  var ApiClientFactory = class _ApiClientFactory {
13856
13859
  static instance;
13857
13860
  storageClient;
13861
+ serviceUsageClient;
13862
+ storageInsightsClient;
13863
+ bigqueryClient;
13858
13864
  constructor() {
13859
13865
  }
13860
13866
  static getInstance() {
@@ -13869,6 +13875,24 @@ var ApiClientFactory = class _ApiClientFactory {
13869
13875
  }
13870
13876
  return this.storageClient;
13871
13877
  }
13878
+ getServiceUsageClient() {
13879
+ if (!this.serviceUsageClient) {
13880
+ this.serviceUsageClient = new ServiceUsageClient();
13881
+ }
13882
+ return this.serviceUsageClient;
13883
+ }
13884
+ getStorageInsightsClient() {
13885
+ if (!this.storageInsightsClient) {
13886
+ this.storageInsightsClient = new StorageInsightsClient();
13887
+ }
13888
+ return this.storageInsightsClient;
13889
+ }
13890
+ getBigQueryClient() {
13891
+ if (!this.bigqueryClient) {
13892
+ this.bigqueryClient = new BigQuery();
13893
+ }
13894
+ return this.bigqueryClient;
13895
+ }
13872
13896
  };
13873
13897
  var apiClientFactory = ApiClientFactory.getInstance();
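The factory above lazily instantiates each Google Cloud client on first use and then reuses it. A minimal standalone sketch of the same lazy-singleton pattern (with a stand-in client class, since the real BigQuery/Storage clients need credentials):

```javascript
// Sketch of the lazy-singleton pattern used by ApiClientFactory.
// FakeClient stands in for BigQuery/Storage/etc. for illustration only.
class FakeClient {
  constructor(name) {
    this.name = name;
  }
}

class ClientFactory {
  static instance;
  #bigqueryClient;

  // Callers always go through getInstance(), so the whole process
  // shares one factory, and therefore at most one client per API.
  static getInstance() {
    if (!ClientFactory.instance) {
      ClientFactory.instance = new ClientFactory();
    }
    return ClientFactory.instance;
  }

  getBigQueryClient() {
    // Created only on first request, then cached on the factory.
    if (!this.#bigqueryClient) {
      this.#bigqueryClient = new FakeClient("bigquery");
    }
    return this.#bigqueryClient;
  }
}

const factory = ClientFactory.getInstance();
```

This keeps client construction (and credential resolution) off the startup path for tools that never touch a given API.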
13874
13898
 
@@ -15588,6 +15612,443 @@ var registerWriteObjectSafeTool = (server) => {
15588
15612
  );
15589
15613
  };
15590
15614
 
15615
+ // src/tools/insights/get_metadata_table_schema.ts
15616
+ var serviceName = "storageinsights.googleapis.com";
15617
+ var inputSchema22 = {
15618
+ datasetConfigName: external_exports.string().describe("The name of the dataset configuration."),
15619
+ datasetConfigLocation: external_exports.string().describe("The location of the dataset configuration."),
15620
+ projectId: external_exports.string().optional().describe("The project ID to check Storage Insights availability for.")
15621
+ };
15622
+ async function getMetadataTableSchema(params) {
15623
+ const bigqueryClient = apiClientFactory.getBigQueryClient();
15624
+ const storageInsightsClient = apiClientFactory.getStorageInsightsClient();
15625
+ const serviceUsageClient = apiClientFactory.getServiceUsageClient();
15626
+ const projectId = params.projectId || process.env["GOOGLE_CLOUD_PROJECT"] || process.env["GCP_PROJECT_ID"];
15627
+ if (!projectId) {
15628
+ throw new Error(
15629
+ "Project ID not specified. Please specify via the projectId parameter or GOOGLE_CLOUD_PROJECT or GCP_PROJECT_ID environment variable."
15630
+ );
15631
+ }
15632
+ const [services] = await serviceUsageClient.listServices({
15633
+ parent: `projects/${projectId}`,
15634
+ filter: "state:ENABLED"
15635
+ });
15636
+ const isEnabled = services.some(
15637
+ (service) => service.config?.name === serviceName
15638
+ );
15639
+ if (!isEnabled) {
15640
+ throw new Error(
15641
+ `Storage Insights API is not enabled for project ${projectId}. Please enable it to proceed.`
15642
+ );
15643
+ }
15644
+ let config;
15645
+ try {
15646
+ [config] = await storageInsightsClient.getDatasetConfig({
15647
+ name: `projects/${projectId}/locations/${params.datasetConfigLocation}/datasetConfigs/${params.datasetConfigName}`
15648
+ });
15649
+ } catch (error) {
15650
+ const err = error instanceof Error ? error : void 0;
15651
+ logger.error("Error getting dataset config:", err);
15652
+ return {
15653
+ content: [
15654
+ {
15655
+ type: "text",
15656
+ text: JSON.stringify({
15657
+ error: "Failed to retrieve dataset configuration",
15658
+ details: err?.message
15659
+ })
15660
+ }
15661
+ ]
15662
+ };
15663
+ }
15664
+ const objectHints = /* @__PURE__ */ new Map([
15665
+ ["snapshotTime", "The snapshot time of the object metadata in RFC 3339 format."],
15666
+ ["bucket", "The name of the bucket containing this object."],
15667
+ ["location", "The location of the source bucket."],
15668
+ [
15669
+ "componentCount",
15670
+ "Returned for composite objects only. Number of non-composite objects in the composite object."
15671
+ ],
15672
+ ["contentDisposition", "Content-Disposition of the object data."],
15673
+ ["contentEncoding", "Content-Encoding of the object data."],
15674
+ ["contentLanguage", "Content-Language of the object data."],
15675
+ [
15676
+ "contentType",
15677
+ "Content-Type of the object data. If an object is stored without a Content-Type, it is served as application/octet-stream."
15678
+ ],
15679
+ [
15680
+ "crc32c",
15681
+ "CRC32c checksum, as described in RFC 4960, Appendix B; encoded using base64 in big-endian byte order."
15682
+ ],
15683
+ ["customTime", "A user-specified timestamp for the object in RFC 3339 format."],
15684
+ ["etag", "HTTP 1.1 Entity tag for the object."],
15685
+ ["eventBasedHold", "Whether or not the object is subject to an event-based hold."],
15686
+ ["generation", "The content generation of this object. Used for object versioning."],
15687
+ [
15688
+ "md5Hash",
15689
+ "MD5 hash of the data, encoded using base64. This field is not present for composite objects."
15690
+ ],
15691
+ ["mediaLink", "A URL for downloading the object's data."],
15692
+ ["metadata", "User-provided metadata, in key/value pairs."],
15693
+ ["metadata.key", "An individual metadata entry key."],
15694
+ ["metadata.value", "An individual metadata entry value."],
15695
+ ["metageneration", "The version of the metadata for this object at this generation."],
15696
+ ["name", "The name of the object."],
15697
+ ["selfLink", "A URL for this object."],
15698
+ ["size", "Content-Length of the data in bytes."],
15699
+ ["storageClass", "Storage class of the object."],
15700
+ ["temporaryHold", "Whether or not the object is subject to a temporary hold."],
15701
+ ["timeCreated", "The creation time of the object in RFC 3339 format."],
15702
+ [
15703
+ "timeDeleted",
15704
+ "The deletion time of the object in RFC 3339 format. Returned if and only if this version of the object is no longer a live version, but remains in the bucket as a noncurrent version."
15705
+ ],
15706
+ ["updated", "The modification time of the object metadata in RFC 3339 format."],
15707
+ ["timeStorageClassUpdated", "The time at which the object's storage class was last changed."],
15708
+ [
15709
+ "retentionExpirationTime",
15710
+ "The earliest time that the object can be deleted, in RFC 3339 format."
15711
+ ],
15712
+ [
15713
+ "softDeleteTime",
15714
+ "If this object has been soft-deleted, this is the time at which it became soft-deleted."
15715
+ ],
15716
+ [
15717
+ "hardDeleteTime",
15718
+ "This is the time (in the future) when the object will no longer be restorable."
15719
+ ],
15720
+ ["project", "The project number of the project the bucket belongs to."]
15721
+ ]);
15722
+ const bucketHints = /* @__PURE__ */ new Map([
15723
+ ["snapshotTime", "The snapshot time of the metadata in RFC 3339 format."],
15724
+ ["name", "The name of the source bucket."],
15725
+ ["location", 'The location of the source bucket (e.g., "US", "EU", "ASIA-EAST1").'],
15726
+ ["project", "The project number of the project the bucket belongs to."],
15727
+ [
15728
+ "storageClass",
15729
+ `The bucket's default storage class (e.g., "STANDARD", "NEARLINE", "COLDLINE").`
15730
+ ],
15731
+ [
15732
+ "public.bucketPolicyOnly",
15733
+ "Deprecated field. Whether to enforce uniform bucket-level access. This concept is now represented by iamConfiguration.uniformBucketLevelAccess.enabled."
15734
+ ],
15735
+ [
15736
+ "public.publicAccessPrevention",
15737
+ `The bucket's public access prevention status ("inherited" or "enforced"). This is the same setting as iamConfiguration.publicAccessPrevention.`
15738
+ ],
15739
+ ["autoclass.enabled", "Whether Autoclass is enabled for the bucket."],
15740
+ ["autoclass.toggleTime", "The time Autoclass was last enabled or disabled."],
15741
+ ["versioning", "Boolean indicating if Object Versioning is enabled for the bucket."],
15742
+ [
15743
+ "lifecycle",
15744
+ "Boolean indicating if the bucket has an Object Lifecycle Management configuration."
15745
+ ],
15746
+ ["metageneration", "The metadata generation of this bucket."],
15747
+ [
15748
+ "timeCreated",
15749
+ "The creation time of the bucket in RFC 3339 format. To perform date calculations, use DATE_SUB or DATE_ADD with CURRENT_DATE()"
15750
+ ],
15751
+ ["tags.tagMap.key", "The key of a tag."],
15752
+ ["tags.tagMap.value", "The value of a tag."],
15753
+ ["tags.lastUpdatedTime", "The last updated time for the tags."],
15754
+ ["labels.key", "An individual label entry key."],
15755
+ ["labels.value", "An individual label entry value."],
15756
+ [
15757
+ "softDeletePolicy.retentionDurationSeconds",
15758
+ "The duration in seconds that soft-deleted objects will be retained."
15759
+ ],
15760
+ [
15761
+ "softDeletePolicy.effectiveTime",
15762
+ "The time from which the soft delete policy became effective."
15763
+ ],
15764
+ [
15765
+ "iamConfiguration.uniformBucketLevelAccess.enabled",
15766
+ "If True, Uniform bucket-level access is enabled, disabling object-level ACLs. This replaces the legacy public.bucketPolicyOnly field."
15767
+ ],
15768
+ [
15769
+ "iamConfiguration.publicAccessPrevention",
15770
+ `The bucket's public access prevention status ("inherited" or "enforced"). This is the same setting as public.publicAccessPrevention.`
15771
+ ],
15772
+ [
15773
+ "resourceTags",
15774
+ "This field appears to be redundant. Bucket resource tags are properly represented under the tags field."
15775
+ ],
15776
+ [
15777
+ "objectCount",
15778
+ "Total number of objects in the bucket. This is a recent addition for aggregated bucket metrics."
15779
+ ],
15780
+ [
15781
+ "totalSize",
15782
+ "Total size of the bucket in bytes. This is a recent addition for aggregated bucket metrics."
15783
+ ]
15784
+ ]);
15785
+ try {
15786
+ const linkedDataset = config.link?.dataset;
15787
+ if (linkedDataset) {
15788
+ const parts = linkedDataset.split("/");
15789
+ const datasetId = parts[parts.length - 1];
15790
+ if (!datasetId) {
15791
+ throw new Error("Could not extract dataset ID from linked dataset.");
15792
+ }
15793
+ const bucketViewId = "bucket_attributes_latest_snapshot_view";
15794
+ const objectViewId = "object_attributes_latest_snapshot_view";
15795
+ const [bucketViewMetadata] = await bigqueryClient.dataset(datasetId).table(bucketViewId).getMetadata();
15796
+ const [objectViewMetadata] = await bigqueryClient.dataset(datasetId).table(objectViewId).getMetadata();
15797
+ const bucketViewFields = bucketViewMetadata.schema.fields.map(
15798
+ (field) => {
15799
+ const fieldWithHint = { ...field };
15800
+ if (field.name && bucketHints.has(field.name)) {
15801
+ fieldWithHint.hint = bucketHints.get(field.name);
15802
+ }
15803
+ return fieldWithHint;
15804
+ }
15805
+ );
15806
+ const objectViewFields = objectViewMetadata.schema.fields.map(
15807
+ (field) => {
15808
+ const fieldWithHint = { ...field };
15809
+ if (field.name && objectHints.has(field.name)) {
15810
+ fieldWithHint.hint = objectHints.get(field.name);
15811
+ }
15812
+ return fieldWithHint;
15813
+ }
15814
+ );
15815
+ const result = {
15816
+ [`${datasetId}.${bucketViewId}`]: bucketViewFields,
15817
+ [`${datasetId}.${objectViewId}`]: objectViewFields,
15818
+ ...config
15819
+ };
15820
+ return {
15821
+ content: [
15822
+ {
15823
+ type: "text",
15824
+ text: JSON.stringify(result)
15825
+ }
15826
+ ]
15827
+ };
15828
+ }
15829
+ throw new Error("Configuration does not have a linked dataset.");
15830
+ } catch (error) {
15831
+ const err = error instanceof Error ? error : void 0;
15832
+ logger.error("Error getting metadata table schema:", err);
15833
+ return {
15834
+ content: [
15835
+ {
15836
+ type: "text",
15837
+ text: JSON.stringify({
15838
+ error: "Failed to get metadata table schema",
15839
+ details: err?.message
15840
+ })
15841
+ }
15842
+ ]
15843
+ };
15844
+ }
15845
+ }
15846
+ var registerGetMetadataTableSchemaTool = (server) => {
15847
+ server.registerTool(
15848
+ "get_metadata_table_schema",
15849
+ {
15850
+ description: "Checks if GCS insights service is enabled and returns the BigQuery table schema for a given insights dataset configuration in JSON format. Also returns hints for each column in the table.",
15851
+ inputSchema: inputSchema22,
15852
+ annotations: {
15853
+ displayOutput: false
15854
+ }
15855
+ },
15856
+ getMetadataTableSchema
15857
+ );
15858
+ };
15859
+
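get_metadata_table_schema annotates each BigQuery schema field with a human-readable hint when one is registered for that field name. The merge step can be sketched in isolation like this (the field shapes are simplified assumptions, and `annotateFields` is our name for the inline `map` above):

```javascript
// Attach a `hint` property to each schema field the hints map knows about,
// without mutating the original field objects.
const bucketHints = new Map([
  ["name", "The name of the source bucket."],
  ["location", "The location of the source bucket."],
]);

function annotateFields(fields, hints) {
  return fields.map((field) => {
    const fieldWithHint = { ...field }; // shallow copy, original untouched
    if (field.name && hints.has(field.name)) {
      fieldWithHint.hint = hints.get(field.name);
    }
    return fieldWithHint;
  });
}

const fields = [
  { name: "name", type: "STRING" },
  { name: "metageneration", type: "INTEGER" }, // no hint registered
];
const annotated = annotateFields(fields, bucketHints);
```

Fields without a registered hint pass through unchanged, so the LLM still sees the full BigQuery schema.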
15860
+ // src/tools/insights/execute_insights_query.ts
15861
+ var inputSchema23 = {
15862
+ config: external_exports.string().describe(
15863
+ "The JSON object of the BigQuery table schema for a given insights dataset configuration."
15864
+ ),
15865
+ query: external_exports.string().describe("The BigQuery SQL query to execute."),
15866
+ jobTimeoutMs: external_exports.number().optional().default(2e4).describe("The maximum amount of time for the job to run on the server.")
15867
+ };
15868
+ async function executeInsightsQuery(params) {
15869
+ const bigqueryClient = apiClientFactory.getBigQueryClient();
15870
+ try {
15871
+ let config;
15872
+ try {
15873
+ config = JSON.parse(params.config);
15874
+ } catch (_e) {
15875
+ throw new Error("Invalid configuration provided. Expected a JSON object or a JSON string.");
15876
+ }
15877
+ if (typeof config !== "object" || config === null) {
15878
+ throw new Error("Invalid configuration provided. Expected a JSON object.");
15879
+ }
15880
+ const linkedDataset = config.link?.dataset;
15881
+ if (!linkedDataset) {
15882
+ throw new Error("The provided configuration is missing the `link.dataset` property.");
15883
+ }
15884
+ const nameParts = config.name?.split("/");
15885
+ if (!nameParts || nameParts.length < 4) {
15886
+ throw new Error(
15887
+ "Invalid configuration name format. Expected `projects/{projectId}/locations/{locationId}/datasetConfigs/{datasetConfigId}`."
15888
+ );
15889
+ }
15890
+ const projectId = nameParts[1];
15891
+ const datasetId = linkedDataset.split("/").pop();
15892
+ const location = nameParts[3];
15893
+ if (!location) {
15894
+ throw new Error("Could not extract location from the configuration name.");
15895
+ }
15896
+ if (!datasetId) {
15897
+ throw new Error("Could not extract datasetId from the linked dataset.");
15898
+ }
15899
+ const baseQueryOptions = {
15900
+ query: params.query,
15901
+ jobTimeoutMs: params.jobTimeoutMs,
15902
+ location
15903
+ };
15904
+ const options = {};
15905
+ if (projectId) {
15906
+ options.projectId = projectId;
15907
+ }
15908
+ logger.info(`Executing query with location: ${location}`);
15909
+ logger.info(`Executing query with datasetId: ${datasetId}`);
15910
+ logger.info(`Executing query with projectId: ${projectId}`);
15911
+ logger.info("Performing BigQuery dry run...");
15912
+ try {
15913
+ const [dryRunJob] = await bigqueryClient.dataset(datasetId, options).createQueryJob({
15914
+ ...baseQueryOptions,
15915
+ dryRun: true
15916
+ });
15917
+ logger.info(`Dry run successful for query. Job ID: ${dryRunJob.id}`);
15918
+ } catch (error) {
15919
+ const err = error;
15920
+ logger.error("BigQuery dry run failed:", err);
15921
+ return {
15922
+ content: [
15923
+ {
15924
+ type: "text",
15925
+ text: JSON.stringify({
15926
+ error: "Validation failed: Invalid BigQuery SQL or access error during dry run",
15927
+ error_type: "QueryValidationError",
15928
+ details: err?.message
15929
+ })
15930
+ }
15931
+ ]
15932
+ };
15933
+ }
15934
+ logger.info("Dry run passed. Executing BigQuery query...");
15935
+ const [job] = await bigqueryClient.dataset(datasetId, options).createQueryJob(baseQueryOptions);
15936
+ logger.info(`Job ${job.id} started.`);
15937
+ const [rows] = await job.getQueryResults();
15938
+ logger.info(`Successfully executed query.`);
15939
+ return {
15940
+ content: [
15941
+ {
15942
+ type: "text",
15943
+ text: JSON.stringify(rows)
15944
+ }
15945
+ ]
15946
+ };
15947
+ } catch (error) {
15948
+ const err = error;
15949
+ logger.error("Error executing insights query:", err);
15950
+ let errorType = "Unknown";
15951
+ if (err.message.includes("Job timed out")) {
15952
+ errorType = "Timeout";
15953
+ }
15954
+ return {
15955
+ content: [
15956
+ {
15957
+ type: "text",
15958
+ text: JSON.stringify({
15959
+ error: "Failed to execute insights query",
15960
+ error_type: errorType,
15961
+ details: err?.message
15962
+ })
15963
+ }
15964
+ ]
15965
+ };
15966
+ }
15967
+ }
15968
+ var registerExecuteInsightsQueryTool = (server) => {
15969
+ server.registerTool(
15970
+ "execute_insights_query",
15971
+ {
15972
+ description: "Executes a BigQuery SQL query against an insights dataset and returns the result.",
15973
+ inputSchema: inputSchema23
15974
+ },
15975
+ executeInsightsQuery
15976
+ );
15977
+ };
15978
+
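execute_insights_query derives the BigQuery project ID and location by splitting the dataset config's resource name, and the dataset ID from the linked dataset path. Pulled out as a standalone helper for illustration (the helper name and the example linked-dataset path are ours):

```javascript
// Hypothetical helper mirroring how execute_insights_query extracts
// projectId/location from a config name shaped like
// projects/{projectId}/locations/{locationId}/datasetConfigs/{datasetConfigId},
// and datasetId from the last segment of the linked dataset path.
function extractQueryTarget(configName, linkedDataset) {
  const nameParts = configName.split("/");
  if (nameParts.length < 4) {
    throw new Error(
      "Invalid configuration name format. Expected `projects/{projectId}/locations/{locationId}/datasetConfigs/{datasetConfigId}`."
    );
  }
  const projectId = nameParts[1];
  const location = nameParts[3];
  const datasetId = linkedDataset.split("/").pop();
  if (!location) throw new Error("Could not extract location.");
  if (!datasetId) throw new Error("Could not extract datasetId.");
  return { projectId, location, datasetId };
}

const target = extractQueryTarget(
  "projects/my-gcp-project/locations/us/datasetConfigs/my-config",
  "projects/my-gcp-project/datasets/my_dataset"
);
```

The extracted `location` matters: the query job must be created in the same BigQuery location as the linked dataset.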
15979
+ // src/tools/insights/list_insights_configs.ts
15980
+ var serviceName2 = "storageinsights.googleapis.com";
15981
+ var inputSchema24 = {
15982
+ projectId: external_exports.string().optional().describe("The project ID to list Storage Insights dataset configurations for.")
15983
+ };
15984
+ async function listInsightsConfigs(params) {
15985
+ const storageInsightsClient = apiClientFactory.getStorageInsightsClient();
15986
+ const serviceUsageClient = apiClientFactory.getServiceUsageClient();
15987
+ const projectId = params.projectId || process.env["GOOGLE_CLOUD_PROJECT"] || process.env["GCP_PROJECT_ID"];
15988
+ if (!projectId) {
15989
+ throw new Error(
15990
+ "Project ID not specified. Please specify via the projectId parameter or GOOGLE_CLOUD_PROJECT or GCP_PROJECT_ID environment variable."
15991
+ );
15992
+ }
15993
+ const [services] = await serviceUsageClient.listServices({
15994
+ parent: `projects/${projectId}`,
15995
+ filter: "state:ENABLED"
15996
+ });
15997
+ const isEnabled = services.some(
15998
+ (service) => service.config?.name === serviceName2
15999
+ );
16000
+ if (!isEnabled) {
16001
+ throw new Error(
16002
+ `Storage Insights API is not enabled for project ${projectId}. Please enable it to proceed.`
16003
+ );
16004
+ }
16005
+ try {
16006
+ const parent = `projects/${projectId}/locations/-`;
16007
+ const iterable = storageInsightsClient.listDatasetConfigsAsync({ parent });
16008
+ const configNames = [];
16009
+ for await (const config of iterable) {
16010
+ if (config.name) {
16011
+ configNames.push(config.name);
16012
+ }
16013
+ }
16014
+ logger.info(`Successfully listed ${configNames.length} dataset config names.`);
16015
+ return {
16016
+ content: [
16017
+ {
16018
+ type: "text",
16019
+ text: JSON.stringify({
16020
+ configurations: configNames
16021
+ })
16022
+ }
16023
+ ]
16024
+ };
16025
+ } catch (error) {
16026
+ const err = error instanceof Error ? error : void 0;
16027
+ logger.error("Error listing dataset configs:", err);
16028
+ return {
16029
+ content: [
16030
+ {
16031
+ type: "text",
16032
+ text: JSON.stringify({
16033
+ error: "Failed to list dataset configurations",
16034
+ details: err?.message
16035
+ })
16036
+ }
16037
+ ]
16038
+ };
16039
+ }
16040
+ }
16041
+ var registerListInsightsConfigsTool = (server) => {
16042
+ server.registerTool(
16043
+ "list_insights_configs",
16044
+ {
16045
+ description: "Lists the names of all Storage Insights dataset configurations for a given project.",
16046
+ inputSchema: inputSchema24
16047
+ },
16048
+ listInsightsConfigs
16049
+ );
16050
+ };
16051
+
15591
16052
  // src/tools/index.ts
15592
16053
  var commonSafeTools = [
15593
16054
  registerListBucketsTool,
@@ -15600,7 +16061,10 @@ var commonSafeTools = [
15600
16061
  registerReadObjectContentTool,
15601
16062
  registerReadObjectMetadataTool,
15602
16063
  registerDownloadObjectTool,
15603
- registerDeleteObjectTool
16064
+ registerDeleteObjectTool,
16065
+ registerGetMetadataTableSchemaTool,
16066
+ registerExecuteInsightsQueryTool,
16067
+ registerListInsightsConfigsTool
15604
16068
  ];
15605
16069
  var safeWriteTools = [
15606
16070
  registerWriteObjectSafeTool,
@@ -15717,7 +16181,7 @@ var StdioServerTransport = class {
15717
16181
  // package.json
15718
16182
  var package_default = {
15719
16183
  name: "@google-cloud/storage-mcp",
15720
- version: "0.2.0",
16184
+ version: "0.3.0",
15721
16185
  type: "module",
15722
16186
  main: "dist/bundle.js",
15723
16187
  bin: {
@@ -15777,7 +16241,10 @@ var package_default = {
15777
16241
  vitest: "^3.2.4"
15778
16242
  },
15779
16243
  dependencies: {
16244
+ "@google-cloud/bigquery": "^7.0.0",
16245
+ "@google-cloud/service-usage": "^4.2.0",
15780
16246
  "@google-cloud/storage": "^7.17.1",
16247
+ "@google-cloud/storageinsights": "^2.2.0",
15781
16248
  "@modelcontextprotocol/sdk": "^1.17.1",
15782
16249
  "@types/yargs": "^17.0.33",
15783
16250
  chardet: "^2.1.0",
@@ -21141,6 +21608,18 @@ For example, you can use the tools to:
21141
21608
 
21142
21609
  - **Clarify Ambiguity:** Do not guess or assume values for required parameters like bucket names. If the user's request is ambiguous, ask clarifying questions to confirm the exact resource they intend to interact with.
21143
21610
  - **Use Defaults:** If a \`project_id\` is not specified by the user, you can use the default value configured in the environment.
21611
+ - **Terminology Grounding:** When users ask for "insights datasets," they are referring to BigQuery datasets populated by Storage Insights Configurations.
21612
+ Do NOT confuse this with legacy "Inventory Reports" (which export CSV/Parquet files to GCS).
21613
+ - **Insights Intent Identification:** There is a set of three tools, the Insights tools, which you must use whenever you detect a Storage Insights intent.
21614
+ To identify "Storage Insights Intent" queries, look for prompts focused on analyzing Google Cloud Storage configuration metadata. These requests typically stem from cost optimization, security auditing, or data governance needs. Users ask to aggregate, filter, or analyze "buckets" and "objects" based on attributes like "size," "age," "location," and "storage class." Key indicators include checks on configurations such as "public access prevention," "versioning," and "lifecycle policies." The queries often involve complex operations, including statistical analysis (correlations, percentiles), ranking, and filtering based on user-defined "tags" or "labels" across projects and regions.
21615
+ - **Insights Intent Handling:** Queries identified to have a "Storage Insights Intent" MUST be handled by generating BigQuery SQL queries against Storage Insights datasets. Example: "What is the total size of my storage?", "Which bucket has the most objects?", "Show the distribution of my storage classes."
21616
+ If the intent is classified as "Storage Insights Intent", proceed to execute the steps below in order:
21617
+
21618
+ 1. Check if the user provided the \`config_name\` and \`config_location\` of the dataset configuration. If not, ask the user for the \`config_name\` and \`config_location\` of the dataset configuration they would like to use, and remember them.
21619
+ 2. If the dataset configuration is available, pass its name and location to get_metadata_table_schema, which returns the schema with hints for each column. If the dataset config name is reported as incorrect, call the list_insights_configs tool, ask the user to select the correct dataset configuration name and location again (don't list the available configs unless the user explicitly asks for them), and retry getting the metadata table schema. Remember the schema for the rest of the session unless the user asks to change the dataset.
21620
+ 3. Once you have the dataset table schema, use it to draft one or more queries and call the execute_insights_query tool to get the relevant data. If a query fails, correct it and retry.
21621
+ **Note on BigQuery Table References:** When constructing BigQuery SQL queries, ensure that table references are fully qualified with the project ID. The format should be \`project_id.dataset_id.table_id\`. For example, if the project ID is \`my-gcp-project\`, the dataset ID is \`my_dataset\`, and the table ID is \`my_table\`, the reference in the query should be \`my-gcp-project.my_dataset.my_table\`.
21622
+ 4. Based on the query results, answer the user's query.
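The table-reference rule in the note above can be sketched as a small formatting helper (the helper name is ours; wrapping the reference in backticks is standard BigQuery SQL practice, since project IDs may contain hyphens):

```javascript
// Hypothetical helper: build a fully qualified BigQuery table reference
// in project_id.dataset_id.table_id form, wrapped in backticks so
// hyphenated project IDs parse correctly in SQL.
function fullyQualifiedTable(projectId, datasetId, tableId) {
  return `\`${projectId}.${datasetId}.${tableId}\``;
}

const table = fullyQualifiedTable(
  "my-gcp-project",
  "my_dataset",
  "object_attributes_latest_snapshot_view"
);
const sql = `SELECT COUNT(*) AS object_count FROM ${table}`;
```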
21144
21623
 
21145
21624
  ## GCS Reference Documentation
21146
21625