data-science-document-ai 1.60.2__tar.gz → 1.61.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (59)
  1. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/PKG-INFO +1 -1
  2. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/pyproject.toml +1 -1
  3. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/postprocessing/common.py +11 -0
  4. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bundeskasse/other/placeholders.json +2 -2
  5. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bundeskasse/other/prompt.txt +3 -2
  6. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/customsInvoice/other/placeholders.json +14 -5
  7. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/customsInvoice/other/prompt.txt +11 -4
  8. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/draftMbl/other/prompt.txt +1 -1
  9. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/finalMbL/other/prompt.txt +1 -1
  10. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/partnerInvoice/other/placeholders.json +14 -5
  11. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/partnerInvoice/other/prompt.txt +10 -3
  12. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/constants.py +0 -0
  13. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/constants_sandbox.py +0 -0
  14. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/docai.py +0 -0
  15. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/docai_processor_config.yaml +0 -0
  16. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/excel_processing.py +0 -0
  17. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/io.py +0 -0
  18. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/llm.py +0 -0
  19. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/log_setup.py +0 -0
  20. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/pdf_processing.py +0 -0
  21. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/postprocessing/postprocess_booking_confirmation.py +0 -0
  22. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/postprocessing/postprocess_commercial_invoice.py +0 -0
  23. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/postprocessing/postprocess_partner_invoice.py +0 -0
  24. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/arrivalNotice/other/placeholders.json +0 -0
  25. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/arrivalNotice/other/prompt.txt +0 -0
  26. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/evergreen/placeholders.json +0 -0
  27. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/evergreen/prompt.txt +0 -0
  28. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/hapag-lloyd/placeholders.json +0 -0
  29. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/hapag-lloyd/prompt.txt +0 -0
  30. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/maersk/placeholders.json +0 -0
  31. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/maersk/prompt.txt +0 -0
  32. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/msc/placeholders.json +0 -0
  33. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/msc/prompt.txt +0 -0
  34. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/oocl/placeholders.json +0 -0
  35. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/oocl/prompt.txt +0 -0
  36. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/other/placeholders.json +0 -0
  37. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/other/prompt.txt +0 -0
  38. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/yangming/placeholders.json +0 -0
  39. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bookingConfirmation/yangming/prompt.txt +0 -0
  40. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/commercialInvoice/other/placeholders.json +0 -0
  41. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/commercialInvoice/other/prompt.txt +0 -0
  42. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/customsAssessment/other/placeholders.json +0 -0
  43. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/customsAssessment/other/prompt.txt +0 -0
  44. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/deliveryOrder/other/placeholders.json +0 -0
  45. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/deliveryOrder/other/prompt.txt +0 -0
  46. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/draftMbl/other/placeholders.json +0 -0
  47. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/finalMbL/other/placeholders.json +0 -0
  48. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/packingList/other/placeholders.json +0 -0
  49. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/packingList/other/prompt.txt +0 -0
  50. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/postprocessing/port_code/placeholders.json +0 -0
  51. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/postprocessing/port_code/prompt_port_code.txt +0 -0
  52. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/preprocessing/carrier/placeholders.json +0 -0
  53. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/preprocessing/carrier/prompt.txt +0 -0
  54. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/shippingInstruction/other/placeholders.json +0 -0
  55. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/shippingInstruction/other/prompt.txt +0 -0
  56. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/prompt_library.py +0 -0
  57. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/setup.py +0 -0
  58. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/tms.py +0 -0
  59. {data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/utils.py +0 -0
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/PKG-INFO
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: data-science-document-ai
- Version: 1.60.2
+ Version: 1.61.0
  Summary: "Document AI repo for data science"
  Author: Naomi Nguyen
  Author-email: naomi.nguyen@forto.com
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/pyproject.toml
@@ -1,6 +1,6 @@
  [tool.poetry]
  name = "data-science-document-ai"
- version = "1.60.2"
+ version = "1.61.0"
  description = "\"Document AI repo for data science\""
  authors = ["Naomi Nguyen <naomi.nguyen@forto.com>", "Kumar Rajendrababu <kumar.rajendrababu@forto.com>", "Igor Tonko <igor.tonko@forto.com>", "Osman Demirel <osman.demirel@forto.com>"]
  packages = [
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/postprocessing/common.py
@@ -723,6 +723,17 @@ async def format_all_entities(result, document_type_code, params, mime_type):
      if document_type_code in ["partnerInvoice", "bundeskasse"]:
          await process_partner_invoice(params, aggregated_data, document_type_code)

+     # TODO: This is a temporary change until the terminal codes are updated
+     if document_type_code == "bookingConfirmation":
+         if "gateInTerminalCode" in aggregated_data:
+             aggregated_data["gateInTerminal"] = aggregated_data.pop(
+                 "gateInTerminalCode"
+             )
+         if "pickUpTerminalCode" in aggregated_data:
+             aggregated_data["pickUpTerminal"] = aggregated_data.pop(
+                 "pickUpTerminalCode"
+             )
+
      logger.info("Data Extraction completed successfully")
      return aggregated_data
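The hunk above moves two keys to new names when the document type is bookingConfirmation. As a minimal sketch of the same pattern, assuming a plain dict of extracted entities, the renames can be driven by a mapping. The helper `rename_keys`, the constant `TERMINAL_KEY_RENAMES`, and the example values are hypothetical and not part of the package; only the four key names come from the diff.

```python
def rename_keys(data: dict, renames: dict) -> dict:
    """Move values from old keys to new keys in place; absent keys are skipped."""
    for old_key, new_key in renames.items():
        if old_key in data:
            # pop() removes the old key and returns its value in one step.
            data[new_key] = data.pop(old_key)
    return data


# Key names taken from the hunk above; the mapping itself is illustrative.
TERMINAL_KEY_RENAMES = {
    "gateInTerminalCode": "gateInTerminal",
    "pickUpTerminalCode": "pickUpTerminal",
}

# Hypothetical extraction result; only one of the two old keys is present.
example = {"gateInTerminalCode": "hypothetical-code", "vesselName": "hypothetical-vessel"}
rename_keys(example, TERMINAL_KEY_RENAMES)
```

Because each rename is guarded by a membership check, a result that lacks one of the old keys passes through without raising `KeyError`, which matches the behavior of the two `if` blocks in the diff.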
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bundeskasse/other/placeholders.json
@@ -93,14 +93,14 @@
  "invoiceNumber": {
  "type": "STRING",
  "nullable": true,
- "description": "Invoice Number is a unique identifier for the invoice, it starts with \"ATC\", \"AT-C\", or \"AT/C\" only (e.g., ATC40, AT-C-40-, AT/C/40/....). Do NOT extract \"NIZZA-Registrierkennzeichen number."
+ "description": "Invoice Number is a unique identifier for the invoice, it starts with ATC, AT-C, or AT/C only (e.g., ATC40, AT-C-40-, AT/C/40/....) It can be found just below the title of the invoice or in the top section of the invoice. Do NOT extract NIZZA-Registrierkennzeichen number (e.g. ATC0040M00...)."
  },
  "containerNumber": {
  "type": "ARRAY",
  "items": {
  "type": "STRING",
  "nullable": true,
- "description": "The unique identifier for each container. It always starts with 4 capital letters and followed by 7 digits. Example: TEMU7972458."
+ "description": "The unique identifier for each container. It always starts with 4 capital letters and followed by 7 digits. Example: TEMU7972458. Do not get confused between 0 vs O in the 7 digits of container number."
  }
  },
  "creditNoteInvoiceNumber": {
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/bundeskasse/other/prompt.txt
@@ -16,7 +16,7 @@ Your role is to accurately extract specific entities from these Customs invoices
  - The amount and the currency is always in EUR both for grandTotal and line items.

  - containerNumber:
- - Container Number consists of 4 capital letters followed by 7 digits (e.g., TEMU7972458, CAIU7222892).
+ - Container Number consists of 4 capital letters followed by 7 digits (e.g., TEMU7972458, CAIU7222892). Do not get confused between 0 vs O in the 7 digits of container number.
  - Few invoices contains multiple container numbers, in that case, all container numbers should be captured.

  - shipmentID:
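The prompt change above asks the model not to confuse the digit 0 with the letter O in the 7-digit part of a container number. A downstream check could enforce the same rule after extraction. The following is a rough sketch under that assumption; `normalize_container_number` is a hypothetical helper and does not appear in this diff.

```python
import re
from typing import Optional

# 4 capital letters followed by 7 digits, as described in the prompt text.
CONTAINER_RE = re.compile(r"^[A-Z]{4}[0-9]{7}$")


def normalize_container_number(raw: str) -> Optional[str]:
    """Return a cleaned container number, or None if it cannot match the format."""
    compact = re.sub(r"\s+", "", raw).upper()  # "CAIU 7222892" -> "CAIU7222892"
    if len(compact) != 11:
        return None
    # Only the 7-character numeric part is corrected; the letter prefix is left alone.
    prefix, digits = compact[:4], compact[4:].replace("O", "0")
    candidate = prefix + digits
    return candidate if CONTAINER_RE.match(candidate) else None
```

Restricting the O-to-0 replacement to the last seven characters keeps a legitimate letter O in the owner prefix intact.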
@@ -26,7 +26,8 @@ Your role is to accurately extract specific entities from these Customs invoices

  - invoiceNumber:
  - Invoice Number is a unique identifier for the invoice, it starts with "ATC", "AT-C", or "AT/C" only (e.g., ATC40..., AT-C-40-..., AT/C/40/....).
- - Do NOT extract if the text is about vehicle registrations, license plates, or location identifiers (e.g., "NIZZA-Registrierkennzeichen: ATC0040....")
+ - It can be found just below the title of the invoice or in the top section of the invoice.
+ - Do NOT extract if the text is about vehicle registrations, license plates, or location identifiers (e.g., "NIZZA-Registrierkennzeichen: ATC0040M00....")

  - creditNoteInvoiceNumber:
  - Credit Note Invoice Number is a unique identifier for the credit note, it starts with "ATS" only (e.g., ATS.....).
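The prefix rules in this hunk (ATC, AT-C, or AT/C for invoices, ATS for credit notes, and nothing labelled as a NIZZA-Registrierkennzeichen) can be illustrated with a small classifier. This is only a sketch: `classify_number` and its `label` parameter are hypothetical, and it assumes the caller can supply the text label that precedes the value on the page.

```python
import re
from typing import Optional

INVOICE_PREFIX_RE = re.compile(r"^AT(?:C|-C|/C)")  # ATC, AT-C, AT/C
CREDIT_NOTE_PREFIX_RE = re.compile(r"^ATS")


def classify_number(value: str, label: str = "") -> Optional[str]:
    """Map a candidate string to the field it would belong to, or None."""
    # Registration numbers share the ATC prefix, so the label decides first.
    if "registrierkennzeichen" in label.lower():
        return None
    text = value.strip()
    if CREDIT_NOTE_PREFIX_RE.match(text):
        return "creditNoteInvoiceNumber"
    if INVOICE_PREFIX_RE.match(text):
        return "invoiceNumber"
    return None
```

The label check runs before the prefix checks because a value such as ATC0040M00 would otherwise pass the invoice prefix test.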
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/customsInvoice/other/placeholders.json
@@ -75,25 +75,34 @@
  },
  "lineItemDescription": {"type": "STRING", "nullable": true,
  "description": "A description of the line item (COGS or Customs line items), which can include details about the service provided."},
+ "totalAmountCurrency": {"type": "STRING", "nullable": true,
+ "description": "The currency code for the line item's total amount, such as EUR, USD, etc."},
  "totalAmount": {"type": "STRING", "nullable": true,
  "description": "The total amount for the line item, which may include the cost of services, and applicable taxes."},
- "totalAmountCurrency": {"type": "STRING", "nullable": true,
- "description": "The currency code for the total amount, such as EUR, USD, etc."},
+ "totalAmountInOriginalCurrency": {
+ "type": "STRING",
+ "description": "The total amount for the line item in original currency"
+ },
+ "totalAmountInInvoiceCurrency": {
+ "type": "STRING",
+ "description": "The total amount for the line item in the currency code the whole invoice's total uses"
+ },
  "totalAmountEuro": {"type": "STRING", "nullable": true,
  "description": "The total amount converted to Euro, if applicable. You can find it by looking for the term 'Total EUR' or 'Amount in Euro' in the line item."},
  "quantity": {"type": "STRING", "nullable": true,
  "description": "The quantity of the item or service provided in the line item."},
- "unitPrice": {"type": "STRING", "nullable": true,
- "description": "The price per unit of the item or service in the line item. Check the naming in a different languages, such as 'Einzelpreis', 'Unit Price', 'Prezzo unitario', 'Preis pro Einheit', etc.. Refer to 'Prezzo unitario' field in the italian invoice example"},
  "unitPriceCurrency": {"type": "STRING", "nullable": true,
  "description": "The currency code for the unit price, such as EUR, USD, etc."},
+ "unitPrice": {"type": "STRING", "nullable": true,
+ "description": "The price per unit of the item or service in the line item. Check the naming in a different languages, such as 'Einzelpreis', 'Unit Price', 'Prezzo unitario', 'Preis pro Einheit', etc.. Refer to 'Prezzo unitario' field in the italian invoice example"},
  "vatAmount": {"type": "STRING", "nullable": true,
  "description": "The VAT amount applied to the line item. This is the tax charged on the totalAmount of the line item."},
  "vatPercentage": {"type": "STRING", "nullable": true,
  "description": "The percentage rate of VAT applied to the totalAmount of the line item. This is used to calculate the vatAmount."
  },
  "containerNumber": {"type": "STRING", "nullable": true,
- "description": "The container number associated with the line item. containerNumber MUST start with 4 letters followed by 7 digits (e.g., CMAU1234567)"},
+ "description": "The container number associated with the line item. containerNumber MUST start with 4 letters followed by 7 digits (e.g., CMAU1234567). Do not get confused between 0 vs O in the 7 digits of container number."
+ },
  "containerSize": {"type": "STRING", "nullable": true,
  "description": "The size of the container associated with the containerNumber, such as 20ft, 40ft, 40HC, 20DC etc."}
  }
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/customsInvoice/other/prompt.txt
@@ -55,11 +55,18 @@ Your role is to accurately extract specific entities from these invoices to supp
  - lineItem: Details of each COGS and Customs line item on the invoice from each page. Make sure to extract each amount and currency separately.
  - uniqueId: A unique id which associated with the lineItem as each line item can belong to a different shipment. Extract only if its available in the line item. Either a shipmentId starting with an S and followed by 6 or 8 numeric values or a mblNumber. If shipmentId or mblNumber does not exist, set it to containerNumber.
  - lineItemDescription: The name or description of the item. Usually, it will be a one line sentence.
- - unitPrice: Even if the quantity is not mentioned, you can still extract the unit price. Check the naming of the columns in a different languages, it can be "Unit Price", "Prezzo unitario", "Prix Unitaire", "Unitario", etc. Refer to "Prezzo unitario" field in the italian invoice example.
- - totalAmount: The total amount for the item. It can be in different currencies, so ensure to capture the currency as well for the totalAmountCurrency.
+ - totalAmountCurrency: the original/native currency code for the line item's total amount. In 3 letters such as USD, EUR
+ - totalAmount: The total amount for the item. It can be in different currencies, so ensure to capture the amount according to the totalAmountCurrency.
+ - totalAmountInOriginalCurrency: line item amount in its original/native currency
+ - totalAmountInInvoiceCurrency: line item amount converted to the invoice's main currency (the currency used for grandTotal)
  - totalAmountEuro: Few line items contains a total amount in Euro. You can find it by looking for the term "Total EUR" or "Amount in Euro" in the line item but it's always in the EURO / € currency. Sometimes, it can be same as totalAmount if the line item is already in Euro.
- - quantity: The quantity of the item or service provided in the line item. Pay attention to 2 x 40HC or 2x40HC. It means, quantity is 2 and 40HC is containerSize but not 240.
- - containerNumber: Container Number always starts with 4 letters and is followed by 7 digits (e.g., ABCD1234567).
+ - quantity: The quantity of the item or service provided in the line item. Pay attention to 2 x 40HC or 2x40HC. It means, quantity is 2 and 40HC is containerSize
+ - unitPriceCurrency: The original/native currency code for the unit price. 3 letters
+ - unitPrice: Even if the quantity is not mentioned, you can still extract the unit price. Check the naming of the columns in a different languages, it can be "Unit Price", "Prezzo unitario", "Prix Unitaire", "Unitario", etc. Refer to "Prezzo unitario" field in the italian invoice example.
+ - vatAmount: The VAT amount applied to the line item. This is the tax charged on the totalAmount of the line item
+ - vatPercentage: The percentage rate of VAT applied to the totalAmount of the line item. This is used to calculate the vatAmount
+ - containerNumber: Container Number always starts with 4 letters and is followed by 7 digits (e.g., ABCD1234567, XALU 8593678).
+ - containerSize: The size of the container associated with the containerNumber, such as 20ft, 40ft, 40HC, 20DC etc.

  - hblNumber and mblNumber:
  - The Master Bill of Lading number. Commonly known as "Bill of Lading Number", "BILL OF LADING NO.", "BL Number", "BL No.", "B/L No.", "BL-Nr.", "B/L", or "HBL No.".
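The quantity rule in this hunk says that "2 x 40HC" or "2x40HC" means quantity 2 with containerSize 40HC. One way to picture that split is a small regular-expression parser. This is a sketch only: `split_quantity_and_size` is a hypothetical helper, and it assumes container sizes are written as two digits followed by two letters, as in the examples the prompt lists (20ft, 40ft, 40HC, 20DC).

```python
import re
from typing import Optional, Tuple

# "<count> x <size>", where size is assumed to be two digits plus two letters.
QTY_SIZE_RE = re.compile(r"^\s*(\d+)\s*[xX]\s*(\d{2}[A-Za-z]{2})\s*$")


def split_quantity_and_size(text: str) -> Optional[Tuple[int, str]]:
    """Split strings like '2 x 40HC' into (quantity, containerSize)."""
    match = QTY_SIZE_RE.match(text)
    if match is None:
        return None  # not in the "<count> x <size>" shape
    return int(match.group(1)), match.group(2)
```

A bare number such as "240" does not match the pattern and returns `None`, so it is never misread as a quantity paired with a size.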
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/draftMbl/other/prompt.txt
@@ -28,7 +28,7 @@ Your role is to accurately extract specific entities from these draftMBLs to sup
  - Vessel Name is the name of the ship carrying the cargo. It can be referred to as "Vessel", "Ship Name", "Schiff", "Schiffsname", "Nave", or "Vessel/Flight No.".

  - containers: Details of each container on the draftMBL. Make sure to extract each container information separately.
- - containerNumber: Container Number consists of 4 capital letters followed by 7 digits (e.g., TEMU7972458, CAIU 7222892).
+ - containerNumber: Container Number consists of 4 capital letters followed by 7 digits (e.g., TEMU7972458, CAIU 7222892). Do not get confused between 0 vs O in the 7 digits of container number.
  - sealNumber: Seal numbers are unique identifiers for shipping seals. They are usually mentioned as seal numbers in the document but they are definitely not container numbers.

  <INSTRUCTIONS>
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/finalMbL/other/prompt.txt
@@ -28,7 +28,7 @@ Your role is to accurately extract specific entities from these finalMBLs to sup
  - Vessel Name is the name of the ship carrying the cargo. It can be referred to as "Vessel", "Ship Name", "Schiff", "Schiffsname", "Nave", or "Vessel/Flight No.".

  - containers: Details of each container on the finalMBL. Make sure to extract each container information separately.
- - containerNumber: Container Number consists of 4 capital letters followed by 7 digits (e.g., TEMU7972458, CAIU 7222892).
+ - containerNumber: Container Number consists of 4 capital letters followed by 7 digits (e.g., TEMU7972458, CAIU 7222892). Do not get confused between 0 vs O in the 7 digits of container number.
  - sealNumber: Seal numbers are unique identifiers for shipping seals. They are usually mentioned as seal numbers in the document but they are definitely not container numbers.

  <INSTRUCTIONS>
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/partnerInvoice/other/placeholders.json
@@ -73,25 +73,34 @@
  },
  "lineItemDescription": {"type": "STRING", "nullable": true,
  "description": "A description of the line item (COGS or Customs line items), which can include details about the service provided."},
+ "totalAmountCurrency": {"type": "STRING", "nullable": true,
+ "description": "The currency code for the line item's total amount, such as EUR, USD, etc."},
  "totalAmount": {"type": "STRING", "nullable": true,
  "description": "The total amount for the line item, which may include the cost of services, and applicable taxes."},
- "totalAmountCurrency": {"type": "STRING", "nullable": true,
- "description": "The currency code for the total amount, such as EUR, USD, etc."},
+ "totalAmountInOriginalCurrency": {
+ "type": "STRING",
+ "description": "The total amount for the line item in original currency"
+ },
+ "totalAmountInInvoiceCurrency": {
+ "type": "STRING",
+ "description": "The total amount for the line item in the currency code the whole invoice's total uses"
+ },
  "totalAmountEuro": {"type": "STRING", "nullable": true,
  "description": "The total amount converted to Euro, if applicable. You can find it by looking for the term 'Total EUR' or 'Amount in Euro' in the line item."},
  "quantity": {"type": "STRING", "nullable": true,
  "description": "The quantity of the item or service provided in the line item."},
- "unitPrice": {"type": "STRING", "nullable": true,
- "description": "The price per unit of the item or service in the line item. Check the naming in a different languages, such as 'Einzelpreis', 'Unit Price', 'Prezzo unitario', 'Preis pro Einheit', etc.. Refer to 'Prezzo unitario' field in the italian invoice example"},
  "unitPriceCurrency": {"type": "STRING", "nullable": true,
  "description": "The currency code for the unit price, such as EUR, USD, etc."},
+ "unitPrice": {"type": "STRING", "nullable": true,
+ "description": "The price per unit of the item or service in the line item. Check the naming in a different languages, such as 'Einzelpreis', 'Unit Price', 'Prezzo unitario', 'Preis pro Einheit', etc.. Refer to 'Prezzo unitario' field in the italian invoice example"},
  "vatAmount": {"type": "STRING", "nullable": true,
  "description": "The VAT amount applied to the line item. This is the tax charged on the totalAmount of the line item."},
  "vatPercentage": {"type": "STRING", "nullable": true,
  "description": "The percentage rate of VAT applied to the totalAmount of the line item. This is used to calculate the vatAmount."
  },
  "containerNumber": {"type": "STRING", "nullable": true,
- "description": "The container number associated with the line item. containerNumber MUST start with 4 letters followed by 7 digits (e.g., CMAU1234567)"},
+ "description": "The container number associated with the line item. containerNumber MUST start with 4 letters followed by 7 digits (e.g., CMAU1234567). Do not get confused between 0 vs O in the 7 digits of container number."
+ },
  "containerSize": {"type": "STRING", "nullable": true,
  "description": "The size of the container associated with the containerNumber, such as 20ft, 40ft, 40HC, 20DC etc."}
  }
{data_science_document_ai-1.60.2 → data_science_document_ai-1.61.0}/src/prompts/library/partnerInvoice/other/prompt.txt
@@ -53,11 +53,18 @@ Your role is to accurately extract specific entities from these invoices to supp
  - lineItem: Details of each COGS and Customs line item on the invoice from each page. Make sure to extract each amount and currency separately.
  - uniqueId: A unique id which associated with the lineItem as each line item can belong to a different shipment. Extract only if its available in the line item. Either a shipmentId starting with an S and followed by 6 or 8 numeric values or a mblNumber. If shipmentId or mblNumber does not exist, set it to containerNumber.
  - lineItemDescription: The name or description of the item. Usually, it will be a one line sentence.
- - unitPrice: Even if the quantity is not mentioned, you can still extract the unit price. Check the naming of the columns in a different languages, it can be "Unit Price", "Prezzo unitario", "Prix Unitaire", "Unitario", etc. Refer to "Prezzo unitario" field in the italian invoice example.
- - totalAmount: The total amount for the item. It can be in different currencies, so ensure to capture the currency as well for the totalAmountCurrency.
+ - totalAmountCurrency: original/native currency code for the line item's total amount. In 3 letters such as USD, EUR
+ - totalAmount: The total amount for the item. It can be in different currencies, so ensure to capture the amount according to the totalAmountCurrency.
+ - totalAmountInOriginalCurrency: line item amount in its original/native currency
+ - totalAmountInInvoiceCurrency: line item amount converted to the invoice's main currency (the currency used for grandTotal)
  - totalAmountEuro: Few line items contains a total amount in Euro. You can find it by looking for the term "Total EUR" or "Amount in Euro" in the line item but it's always in the EURO / € currency. Sometimes, it can be same as totalAmount if the line item is already in Euro.
- - quantity: The quantity of the item or service provided in the line item. Pay attention to 2 x 40HC or 2x40HC. It means, quantity is 2 and 40HC is containerSize but not 240.
+ - quantity: The quantity of the item or service provided in the line item. Pay attention to 2 x 40HC or 2x40HC. It means, quantity is 2 and 40HC is containerSize
+ - unitPriceCurrency: The original/native currency code for the unit price. 3 letters
+ - unitPrice: Even if the quantity is not mentioned, you can still extract the unit price. Check the naming of the columns in a different languages, it can be "Unit Price", "Prezzo unitario", "Prix Unitaire", "Unitario", etc. Refer to "Prezzo unitario" field in the italian invoice example.
+ - vatAmount: The VAT amount applied to the line item. This is the tax charged on the totalAmount of the line item
+ - vatPercentage: The percentage rate of VAT applied to the totalAmount of the line item. This is used to calculate the vatAmount
  - containerNumber: Container Number always starts with 4 letters and is followed by 7 digits (e.g., ABCD1234567, XALU 8593678).
+ - containerSize: The size of the container associated with the containerNumber, such as 20ft, 40ft, 40HC, 20DC etc.

  - hblNumber and mblNumber:
  - The Master Bill of Lading number. Commonly known as "Bill of Lading Number", "BILL OF LADING NO.", "BL Number", "BL No.", "B/L No.", "BL-Nr.", "B/L", or "HBL No.".