rack-mail_exception 0.0.1
- data/.document +5 -0
- data/.gitignore +22 -0
- data/LICENSE +20 -0
- data/README.rdoc +38 -0
- data/Rakefile +56 -0
- data/VERSION +1 -0
- data/lib/rack/mail_exception.rb +103 -0
- data/test/helper.rb +13 -0
- data/test/test_rack_mail_exception.rb +93 -0
- data/vendor/mail/.bundle/config +2 -0
- data/vendor/mail/CHANGELOG.rdoc +370 -0
- data/vendor/mail/Dependencies.txt +3 -0
- data/vendor/mail/Gemfile +17 -0
- data/vendor/mail/README.rdoc +572 -0
- data/vendor/mail/ROADMAP +92 -0
- data/vendor/mail/Rakefile +41 -0
- data/vendor/mail/TODO.rdoc +9 -0
- data/vendor/mail/lib/mail.rb +76 -0
- data/vendor/mail/lib/mail/attachments_list.rb +99 -0
- data/vendor/mail/lib/mail/body.rb +287 -0
- data/vendor/mail/lib/mail/configuration.rb +67 -0
- data/vendor/mail/lib/mail/core_extensions/blank.rb +26 -0
- data/vendor/mail/lib/mail/core_extensions/nil.rb +11 -0
- data/vendor/mail/lib/mail/core_extensions/string.rb +27 -0
- data/vendor/mail/lib/mail/elements.rb +14 -0
- data/vendor/mail/lib/mail/elements/address.rb +306 -0
- data/vendor/mail/lib/mail/elements/address_list.rb +74 -0
- data/vendor/mail/lib/mail/elements/content_disposition_element.rb +30 -0
- data/vendor/mail/lib/mail/elements/content_location_element.rb +25 -0
- data/vendor/mail/lib/mail/elements/content_transfer_encoding_element.rb +24 -0
- data/vendor/mail/lib/mail/elements/content_type_element.rb +35 -0
- data/vendor/mail/lib/mail/elements/date_time_element.rb +26 -0
- data/vendor/mail/lib/mail/elements/envelope_from_element.rb +34 -0
- data/vendor/mail/lib/mail/elements/message_ids_element.rb +29 -0
- data/vendor/mail/lib/mail/elements/mime_version_element.rb +26 -0
- data/vendor/mail/lib/mail/elements/phrase_list.rb +21 -0
- data/vendor/mail/lib/mail/elements/received_element.rb +30 -0
- data/vendor/mail/lib/mail/encodings.rb +258 -0
- data/vendor/mail/lib/mail/encodings/7bit.rb +31 -0
- data/vendor/mail/lib/mail/encodings/8bit.rb +31 -0
- data/vendor/mail/lib/mail/encodings/base64.rb +33 -0
- data/vendor/mail/lib/mail/encodings/binary.rb +31 -0
- data/vendor/mail/lib/mail/encodings/quoted_printable.rb +38 -0
- data/vendor/mail/lib/mail/encodings/transfer_encoding.rb +58 -0
- data/vendor/mail/lib/mail/envelope.rb +35 -0
- data/vendor/mail/lib/mail/field.rb +223 -0
- data/vendor/mail/lib/mail/field_list.rb +33 -0
- data/vendor/mail/lib/mail/fields.rb +35 -0
- data/vendor/mail/lib/mail/fields/bcc_field.rb +56 -0
- data/vendor/mail/lib/mail/fields/cc_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/comments_field.rb +41 -0
- data/vendor/mail/lib/mail/fields/common/address_container.rb +16 -0
- data/vendor/mail/lib/mail/fields/common/common_address.rb +125 -0
- data/vendor/mail/lib/mail/fields/common/common_date.rb +42 -0
- data/vendor/mail/lib/mail/fields/common/common_field.rb +50 -0
- data/vendor/mail/lib/mail/fields/common/common_message_id.rb +43 -0
- data/vendor/mail/lib/mail/fields/common/parameter_hash.rb +52 -0
- data/vendor/mail/lib/mail/fields/content_description_field.rb +19 -0
- data/vendor/mail/lib/mail/fields/content_disposition_field.rb +69 -0
- data/vendor/mail/lib/mail/fields/content_id_field.rb +63 -0
- data/vendor/mail/lib/mail/fields/content_location_field.rb +42 -0
- data/vendor/mail/lib/mail/fields/content_transfer_encoding_field.rb +50 -0
- data/vendor/mail/lib/mail/fields/content_type_field.rb +185 -0
- data/vendor/mail/lib/mail/fields/date_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/from_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/in_reply_to_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/keywords_field.rb +44 -0
- data/vendor/mail/lib/mail/fields/message_id_field.rb +83 -0
- data/vendor/mail/lib/mail/fields/mime_version_field.rb +53 -0
- data/vendor/mail/lib/mail/fields/optional_field.rb +13 -0
- data/vendor/mail/lib/mail/fields/received_field.rb +67 -0
- data/vendor/mail/lib/mail/fields/references_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/reply_to_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/resent_bcc_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/resent_cc_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/resent_date_field.rb +35 -0
- data/vendor/mail/lib/mail/fields/resent_from_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/resent_message_id_field.rb +34 -0
- data/vendor/mail/lib/mail/fields/resent_sender_field.rb +62 -0
- data/vendor/mail/lib/mail/fields/resent_to_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/return_path_field.rb +64 -0
- data/vendor/mail/lib/mail/fields/sender_field.rb +67 -0
- data/vendor/mail/lib/mail/fields/structured_field.rb +51 -0
- data/vendor/mail/lib/mail/fields/subject_field.rb +16 -0
- data/vendor/mail/lib/mail/fields/to_field.rb +55 -0
- data/vendor/mail/lib/mail/fields/unstructured_field.rb +166 -0
- data/vendor/mail/lib/mail/header.rb +262 -0
- data/vendor/mail/lib/mail/mail.rb +234 -0
- data/vendor/mail/lib/mail/message.rb +1867 -0
- data/vendor/mail/lib/mail/network.rb +9 -0
- data/vendor/mail/lib/mail/network/delivery_methods/file_delivery.rb +40 -0
- data/vendor/mail/lib/mail/network/delivery_methods/sendmail.rb +62 -0
- data/vendor/mail/lib/mail/network/delivery_methods/smtp.rb +110 -0
- data/vendor/mail/lib/mail/network/delivery_methods/test_mailer.rb +40 -0
- data/vendor/mail/lib/mail/network/retriever_methods/imap.rb +18 -0
- data/vendor/mail/lib/mail/network/retriever_methods/pop3.rb +149 -0
- data/vendor/mail/lib/mail/parsers/address_lists.rb +64 -0
- data/vendor/mail/lib/mail/parsers/address_lists.treetop +19 -0
- data/vendor/mail/lib/mail/parsers/content_disposition.rb +387 -0
- data/vendor/mail/lib/mail/parsers/content_disposition.treetop +46 -0
- data/vendor/mail/lib/mail/parsers/content_location.rb +139 -0
- data/vendor/mail/lib/mail/parsers/content_location.treetop +20 -0
- data/vendor/mail/lib/mail/parsers/content_transfer_encoding.rb +162 -0
- data/vendor/mail/lib/mail/parsers/content_transfer_encoding.treetop +20 -0
- data/vendor/mail/lib/mail/parsers/content_type.rb +539 -0
- data/vendor/mail/lib/mail/parsers/content_type.treetop +58 -0
- data/vendor/mail/lib/mail/parsers/date_time.rb +114 -0
- data/vendor/mail/lib/mail/parsers/date_time.treetop +11 -0
- data/vendor/mail/lib/mail/parsers/envelope_from.rb +194 -0
- data/vendor/mail/lib/mail/parsers/envelope_from.treetop +32 -0
- data/vendor/mail/lib/mail/parsers/message_ids.rb +45 -0
- data/vendor/mail/lib/mail/parsers/message_ids.treetop +15 -0
- data/vendor/mail/lib/mail/parsers/mime_version.rb +144 -0
- data/vendor/mail/lib/mail/parsers/mime_version.treetop +19 -0
- data/vendor/mail/lib/mail/parsers/phrase_lists.rb +45 -0
- data/vendor/mail/lib/mail/parsers/phrase_lists.treetop +15 -0
- data/vendor/mail/lib/mail/parsers/received.rb +71 -0
- data/vendor/mail/lib/mail/parsers/received.treetop +11 -0
- data/vendor/mail/lib/mail/parsers/rfc2045.rb +464 -0
- data/vendor/mail/lib/mail/parsers/rfc2045.treetop +36 -0
- data/vendor/mail/lib/mail/parsers/rfc2822.rb +5318 -0
- data/vendor/mail/lib/mail/parsers/rfc2822.treetop +410 -0
- data/vendor/mail/lib/mail/parsers/rfc2822_obsolete.rb +3757 -0
- data/vendor/mail/lib/mail/parsers/rfc2822_obsolete.treetop +241 -0
- data/vendor/mail/lib/mail/part.rb +102 -0
- data/vendor/mail/lib/mail/parts_list.rb +34 -0
- data/vendor/mail/lib/mail/patterns.rb +30 -0
- data/vendor/mail/lib/mail/utilities.rb +181 -0
- data/vendor/mail/lib/mail/version.rb +10 -0
- data/vendor/mail/lib/mail/version_specific/ruby_1_8.rb +97 -0
- data/vendor/mail/lib/mail/version_specific/ruby_1_9.rb +87 -0
- data/vendor/mail/lib/tasks/corpus.rake +125 -0
- data/vendor/mail/lib/tasks/treetop.rake +10 -0
- data/vendor/mail/mail.gemspec +20 -0
- data/vendor/mail/reference/US ASCII Table.txt +130 -0
- data/vendor/mail/reference/rfc1035 Domain Implementation and Specification.txt +3083 -0
- data/vendor/mail/reference/rfc1049 Content-Type Header Field for Internet Messages.txt +451 -0
- data/vendor/mail/reference/rfc1344 Implications of MIME for Internet Mail Gateways.txt +586 -0
- data/vendor/mail/reference/rfc1345 Character Mnemonics & Character Sets.txt +5761 -0
- data/vendor/mail/reference/rfc1524 A User Agent Configuration Mechanism For Multimedia Mail Format Information.txt +675 -0
- data/vendor/mail/reference/rfc1652 SMTP Service Extension for 8bit-MIMEtransport.txt +339 -0
- data/vendor/mail/reference/rfc1892 Multipart Report .txt +227 -0
- data/vendor/mail/reference/rfc1893 Mail System Status Codes.txt +843 -0
- data/vendor/mail/reference/rfc2045 Multipurpose Internet Mail Extensions (1).txt +1739 -0
- data/vendor/mail/reference/rfc2046 Multipurpose Internet Mail Extensions (2).txt +2467 -0
- data/vendor/mail/reference/rfc2047 Multipurpose Internet Mail Extensions (3).txt +843 -0
- data/vendor/mail/reference/rfc2048 Multipurpose Internet Mail Extensions (4).txt +1180 -0
- data/vendor/mail/reference/rfc2049 Multipurpose Internet Mail Extensions (5).txt +1347 -0
- data/vendor/mail/reference/rfc2111 Content-ID and Message-ID URLs.txt +283 -0
- data/vendor/mail/reference/rfc2183 Content-Disposition Header Field.txt +675 -0
- data/vendor/mail/reference/rfc2231 MIME Parameter Value and Encoded Word Extensions.txt +563 -0
- data/vendor/mail/reference/rfc2387 MIME Multipart-Related Content-type.txt +563 -0
- data/vendor/mail/reference/rfc2821 Simple Mail Transfer Protocol.txt +3711 -0
- data/vendor/mail/reference/rfc2822 Internet Message Format.txt +2859 -0
- data/vendor/mail/reference/rfc3462 Reporting of Mail System Administrative Messages.txt +396 -0
- data/vendor/mail/reference/rfc3696 Checking and Transformation of Names.txt +898 -0
- data/vendor/mail/reference/rfc4155 The application-mbox Media Type.txt +502 -0
- data/vendor/mail/reference/rfc4234 Augmented BNF for Syntax Specifications: ABNF.txt +899 -0
- data/vendor/mail/reference/rfc822 Standard for the Format of ARPA Internet Text Messages.txt +2900 -0
- data/vendor/mail/spec/environment.rb +15 -0
- data/vendor/mail/spec/features/making_a_new_message.feature +14 -0
- data/vendor/mail/spec/features/steps/env.rb +6 -0
- data/vendor/mail/spec/features/steps/making_a_new_message_steps.rb +11 -0
- data/vendor/mail/spec/fixtures/attachments/basic_email.eml +31 -0
- data/vendor/mail/spec/fixtures/attachments/test.gif +0 -0
- data/vendor/mail/spec/fixtures/attachments/test.jpg +0 -0
- data/vendor/mail/spec/fixtures/attachments/test.pdf +0 -0
- data/vendor/mail/spec/fixtures/attachments/test.png +0 -0
- data/vendor/mail/spec/fixtures/attachments/test.tiff +0 -0
- data/vendor/mail/spec/fixtures/attachments/test.zip +0 -0
- data/vendor/mail/spec/fixtures/attachments/てすと.txt +2 -0
- data/vendor/mail/spec/fixtures/emails/attachment_emails/attachment_content_disposition.eml +29 -0
- data/vendor/mail/spec/fixtures/emails/attachment_emails/attachment_content_location.eml +32 -0
- data/vendor/mail/spec/fixtures/emails/attachment_emails/attachment_message_rfc822.eml +92 -0
- data/vendor/mail/spec/fixtures/emails/attachment_emails/attachment_only_email.eml +17 -0
- data/vendor/mail/spec/fixtures/emails/attachment_emails/attachment_pdf.eml +70 -0
- data/vendor/mail/spec/fixtures/emails/attachment_emails/attachment_with_encoded_name.eml +47 -0
- data/vendor/mail/spec/fixtures/emails/attachment_emails/attachment_with_quoted_filename.eml +60 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/cant_parse_from.eml +33 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/content_transfer_encoding_7-bit.eml +231 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/content_transfer_encoding_empty.eml +33 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/content_transfer_encoding_plain.eml +148 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/content_transfer_encoding_qp_with_space.eml +53 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/content_transfer_encoding_spam.eml +44 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/content_transfer_encoding_text-html.eml +50 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/content_transfer_encoding_with_8bits.eml +770 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/content_transfer_encoding_with_semi_colon.eml +269 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/content_transfer_encoding_x_uuencode.eml +79 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/empty_group_lists.eml +162 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/header_fields_with_empty_values.eml +33 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/missing_body.eml +16 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/missing_content_disposition.eml +43 -0
- data/vendor/mail/spec/fixtures/emails/error_emails/multiple_content_types.eml +25 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email11.eml +34 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email12.eml +32 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email2.eml +114 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email4.eml +59 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email7.eml +66 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email_encoded_stack_level_too_deep.eml +53 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email_with_illegal_boundary.eml +58 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email_with_mimepart_without_content_type.eml +94 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email_with_multipart_mixed_quoted_boundary.eml +50 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email_with_nested_attachment.eml +100 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/raw_email_with_quoted_illegal_boundary.eml +58 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/sig_only_email.eml +29 -0
- data/vendor/mail/spec/fixtures/emails/mime_emails/two_from_in_message.eml +42 -0
- data/vendor/mail/spec/fixtures/emails/multi_charset/japanese.eml +9 -0
- data/vendor/mail/spec/fixtures/emails/multi_charset/japanese_attachment.eml +27 -0
- data/vendor/mail/spec/fixtures/emails/multi_charset/japanese_attachment_long_name.eml +44 -0
- data/vendor/mail/spec/fixtures/emails/multipart_report_emails/multi_address_bounce1.eml +179 -0
- data/vendor/mail/spec/fixtures/emails/multipart_report_emails/multi_address_bounce2.eml +179 -0
- data/vendor/mail/spec/fixtures/emails/multipart_report_emails/report_422.eml +98 -0
- data/vendor/mail/spec/fixtures/emails/multipart_report_emails/report_530.eml +97 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/basic_email.eml +31 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email.eml +14 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email10.eml +20 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email5.eml +19 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email6.eml +20 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email8.eml +47 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_bad_time.eml +62 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_double_at_in_header.eml +14 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_incorrect_header.eml +28 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_multiple_from.eml +30 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_quoted_with_0d0a.eml +14 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_reply.eml +32 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_simple.eml +11 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_string_in_date_field.eml +17 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_trailing_dot.eml +21 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_with_bad_date.eml +48 -0
- data/vendor/mail/spec/fixtures/emails/plain_emails/raw_email_with_partially_quoted_subject.eml +14 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example01.eml +8 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example02.eml +9 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example03.eml +7 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example04.eml +7 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example05.eml +8 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example06.eml +10 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example07.eml +9 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example08.eml +12 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example09.eml +15 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example10.eml +15 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example11.eml +6 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example12.eml +8 -0
- data/vendor/mail/spec/fixtures/emails/rfc2822/example13.eml +10 -0
- data/vendor/mail/spec/fixtures/emails/sample_output_multipart +0 -0
- data/vendor/mail/spec/mail/attachments_list_spec.rb +214 -0
- data/vendor/mail/spec/mail/body_spec.rb +385 -0
- data/vendor/mail/spec/mail/configuration_spec.rb +19 -0
- data/vendor/mail/spec/mail/core_extensions/string_spec.rb +62 -0
- data/vendor/mail/spec/mail/core_extensions_spec.rb +99 -0
- data/vendor/mail/spec/mail/elements/address_list_spec.rb +109 -0
- data/vendor/mail/spec/mail/elements/address_spec.rb +609 -0
- data/vendor/mail/spec/mail/elements/date_time_element_spec.rb +20 -0
- data/vendor/mail/spec/mail/elements/envelope_from_element_spec.rb +31 -0
- data/vendor/mail/spec/mail/elements/message_ids_element_spec.rb +43 -0
- data/vendor/mail/spec/mail/elements/phrase_list_spec.rb +22 -0
- data/vendor/mail/spec/mail/elements/received_element_spec.rb +34 -0
- data/vendor/mail/spec/mail/encoding_spec.rb +189 -0
- data/vendor/mail/spec/mail/encodings/base64_spec.rb +25 -0
- data/vendor/mail/spec/mail/encodings/quoted_printable_spec.rb +25 -0
- data/vendor/mail/spec/mail/encodings_spec.rb +664 -0
- data/vendor/mail/spec/mail/example_emails_spec.rb +303 -0
- data/vendor/mail/spec/mail/field_list_spec.rb +33 -0
- data/vendor/mail/spec/mail/field_spec.rb +198 -0
- data/vendor/mail/spec/mail/fields/bcc_field_spec.rb +89 -0
- data/vendor/mail/spec/mail/fields/cc_field_spec.rb +79 -0
- data/vendor/mail/spec/mail/fields/comments_field_spec.rb +25 -0
- data/vendor/mail/spec/mail/fields/common/address_container_spec.rb +18 -0
- data/vendor/mail/spec/mail/fields/common/common_address_spec.rb +132 -0
- data/vendor/mail/spec/mail/fields/common/common_date_spec.rb +25 -0
- data/vendor/mail/spec/mail/fields/common/common_field_spec.rb +69 -0
- data/vendor/mail/spec/mail/fields/common/common_message_id_spec.rb +30 -0
- data/vendor/mail/spec/mail/fields/common/parameter_hash_spec.rb +56 -0
- data/vendor/mail/spec/mail/fields/content_description_field_spec.rb +39 -0
- data/vendor/mail/spec/mail/fields/content_disposition_field_spec.rb +55 -0
- data/vendor/mail/spec/mail/fields/content_id_field_spec.rb +117 -0
- data/vendor/mail/spec/mail/fields/content_location_field_spec.rb +46 -0
- data/vendor/mail/spec/mail/fields/content_transfer_encoding_field_spec.rb +113 -0
- data/vendor/mail/spec/mail/fields/content_type_field_spec.rb +678 -0
- data/vendor/mail/spec/mail/fields/date_field_spec.rb +73 -0
- data/vendor/mail/spec/mail/fields/envelope_spec.rb +48 -0
- data/vendor/mail/spec/mail/fields/from_field_spec.rb +89 -0
- data/vendor/mail/spec/mail/fields/in_reply_to_field_spec.rb +62 -0
- data/vendor/mail/spec/mail/fields/keywords_field_spec.rb +66 -0
- data/vendor/mail/spec/mail/fields/message_id_field_spec.rb +147 -0
- data/vendor/mail/spec/mail/fields/mime_version_field_spec.rb +166 -0
- data/vendor/mail/spec/mail/fields/received_field_spec.rb +44 -0
- data/vendor/mail/spec/mail/fields/references_field_spec.rb +35 -0
- data/vendor/mail/spec/mail/fields/reply_to_field_spec.rb +67 -0
- data/vendor/mail/spec/mail/fields/resent_bcc_field_spec.rb +66 -0
- data/vendor/mail/spec/mail/fields/resent_cc_field_spec.rb +66 -0
- data/vendor/mail/spec/mail/fields/resent_date_field_spec.rb +39 -0
- data/vendor/mail/spec/mail/fields/resent_from_field_spec.rb +66 -0
- data/vendor/mail/spec/mail/fields/resent_message_id_field_spec.rb +24 -0
- data/vendor/mail/spec/mail/fields/resent_sender_field_spec.rb +58 -0
- data/vendor/mail/spec/mail/fields/resent_to_field_spec.rb +66 -0
- data/vendor/mail/spec/mail/fields/return_path_field_spec.rb +52 -0
- data/vendor/mail/spec/mail/fields/sender_field_spec.rb +58 -0
- data/vendor/mail/spec/mail/fields/structured_field_spec.rb +72 -0
- data/vendor/mail/spec/mail/fields/to_field_spec.rb +92 -0
- data/vendor/mail/spec/mail/fields/unstructured_field_spec.rb +134 -0
- data/vendor/mail/spec/mail/header_spec.rb +578 -0
- data/vendor/mail/spec/mail/mail_spec.rb +34 -0
- data/vendor/mail/spec/mail/message_spec.rb +1409 -0
- data/vendor/mail/spec/mail/mime_messages_spec.rb +435 -0
- data/vendor/mail/spec/mail/multipart_report_spec.rb +112 -0
- data/vendor/mail/spec/mail/network/delivery_methods/file_delivery_spec.rb +79 -0
- data/vendor/mail/spec/mail/network/delivery_methods/sendmail_spec.rb +125 -0
- data/vendor/mail/spec/mail/network/delivery_methods/smtp_spec.rb +133 -0
- data/vendor/mail/spec/mail/network/delivery_methods/test_mailer_spec.rb +57 -0
- data/vendor/mail/spec/mail/network/retriever_methods/pop3_spec.rb +180 -0
- data/vendor/mail/spec/mail/network_spec.rb +359 -0
- data/vendor/mail/spec/mail/parsers/address_lists_parser_spec.rb +15 -0
- data/vendor/mail/spec/mail/parsers/content_transfer_encoding_parser_spec.rb +72 -0
- data/vendor/mail/spec/mail/part_spec.rb +129 -0
- data/vendor/mail/spec/mail/parts_list_spec.rb +12 -0
- data/vendor/mail/spec/mail/round_tripping_spec.rb +44 -0
- data/vendor/mail/spec/mail/utilities_spec.rb +327 -0
- data/vendor/mail/spec/mail/version_specific/escape_paren_1_8_spec.rb +32 -0
- data/vendor/mail/spec/matchers/break_down_to.rb +35 -0
- data/vendor/mail/spec/spec_helper.rb +163 -0
- metadata +442 -0
@@ -0,0 +1,396 @@
|
|
1
|
+
|
2
|
+
|
3
|
+
|
4
|
+
|
5
|
+
|
6
|
+
Network Working Group G. Vaudreuil
|
7
|
+
Request for Comments: 3462 Lucent Technologies
|
8
|
+
Obsoletes: 1892 January 2003
|
9
|
+
Category: Standards Track
|
10
|
+
|
11
|
+
|
12
|
+
The Multipart/Report Content Type
|
13
|
+
for the Reporting of
|
14
|
+
Mail System Administrative Messages
|
15
|
+
|
16
|
+
Status of this Memo
|
17
|
+
|
18
|
+
This document specifies an Internet standards track protocol for the
|
19
|
+
Internet community, and requests discussion and suggestions for
|
20
|
+
improvements. Please refer to the current edition of the "Internet
|
21
|
+
Official Protocol Standards" (STD 1) for the standardization state
|
22
|
+
and status of this protocol. Distribution of this memo is unlimited.
|
23
|
+
|
24
|
+
Copyright Notice
|
25
|
+
|
26
|
+
Copyright (C) The Internet Society (2003). All Rights Reserved.
|
27
|
+
|
28
|
+
Abstract
|
29
|
+
|
30
|
+
The Multipart/Report Multipurpose Internet Mail Extensions (MIME)
|
31
|
+
content-type is a general "family" or "container" type for electronic
|
32
|
+
mail reports of any kind. Although this memo defines only the use of
|
33
|
+
the Multipart/Report content-type with respect to delivery status
|
34
|
+
reports, mail processing programs will benefit if a single content-
|
34
|
+
type is used for all kinds of reports.
|
36
|
+
|
37
|
+
This document is part of a four document set describing the delivery
|
38
|
+
status report service. This collection includes the Simple Mail
|
39
|
+
Transfer Protocol (SMTP) extensions to request delivery status
|
40
|
+
reports, a MIME content for the reporting of delivery reports, an
|
41
|
+
enumeration of extended status codes, and a multipart container for
|
42
|
+
the delivery report, the original message, and a human-friendly
|
43
|
+
summary of the failure.
|
44
|
+
|
45
|
+
|
46
|
+
|
47
|
+
|
48
|
+
|
49
|
+
|
50
|
+
|
51
|
+
|
52
|
+
|
53
|
+
|
54
|
+
|
55
|
+
|
56
|
+
|
57
|
+
Vaudreuil Standards Track [Page 1]
|
58
|
+
|
59
|
+
RFC 3462 Multipart/Report January 2003
|
60
|
+
|
61
|
+
|
62
|
+
Table of Contents
|
63
|
+
|
64
|
+
Document Conventions................................................2
|
65
|
+
1. The Multipart/Report Content Type................................2
|
66
|
+
2. The Text/RFC822-Headers..........................................4
|
67
|
+
3. Security Considerations..........................................4
|
68
|
+
4. Normative References.............................................5
|
69
|
+
Appendix A - Changes from RFC 1892..................................6
|
70
|
+
Author's Address....................................................6
|
71
|
+
Full Copyright Statement............................................7
|
72
|
+
|
73
|
+
Document Conventions
|
74
|
+
|
75
|
+
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
|
76
|
+
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
|
77
|
+
document are to be interpreted as described in BCP 14, RFC 2119
|
78
|
+
[RFC2119].
|
79
|
+
|
80
|
+
1. The Multipart/Report Content Type
|
81
|
+
|
82
|
+
The Multipart/Report MIME content-type is a general "family" or
|
83
|
+
"container" type for electronic mail reports of any kind. Although
|
84
|
+
this memo defines only the use of the Multipart/Report content-type
|
85
|
+
with respect to delivery status reports, mail processing programs
|
86
|
+
will benefit if a single content-type is used for all kinds of
|
87
|
+
reports.
|
88
|
+
|
89
|
+
The Multipart/Report content-type is defined as follows:
|
90
|
+
|
91
|
+
MIME type name: multipart
|
92
|
+
MIME subtype name: report
|
93
|
+
Required parameters: boundary, report-type
|
94
|
+
Optional parameters: none
|
95
|
+
Encoding considerations: 7bit should always be adequate
|
96
|
+
Security considerations: see section 3 of this memo
|
97
|
+
|
98
|
+
The syntax of Multipart/Report is identical to the Multipart/Mixed
|
99
|
+
content type defined in [MIME]. When used to send a report, the
|
100
|
+
Multipart/Report content-type must be the top-level MIME content type
|
101
|
+
for any report message. The report-type parameter identifies the
|
102
|
+
type of report. The parameter is the MIME content sub-type of the
|
103
|
+
second body part of the Multipart/Report.
|
104
|
+
|
105
|
+
User agents and gateways must be able to automatically determine that
|
106
|
+
a message is a mail system report and should be processed as such.
|
107
|
+
Placing the Multipart/Report as the outermost content provides a
|
108
|
+
mechanism whereby an auto-processor may detect through parsing the
|
109
|
+
RFC 822 headers that the message is a report.
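The detection step described above can be sketched in a few lines of Ruby. This is a minimal illustration, not code from the vendored mail library: the helper name `parse_report_content_type` and the sample boundary string are assumptions. It splits parameters on `;` and therefore does not handle a quoted parameter value that itself contains a semicolon.

```ruby
# Hypothetical helper, not part of RFC 3462 or of any library.
# Returns the parameter hash when the header value names multipart/report
# and carries both required parameters; returns nil otherwise.
def parse_report_content_type(value)
  media_type, *raw_params = value.split(';').map(&:strip)
  return nil unless media_type&.casecmp?('multipart/report')

  params = {}
  raw_params.each do |param|
    name, val = param.split('=', 2)
    next if val.nil?

    # Parameter names are case-insensitive; drop optional quotes.
    params[name.strip.downcase] = val.strip.delete('"')
  end

  # Both parameters are listed as required in the definition above.
  return nil unless params.key?('boundary') && params.key?('report-type')

  params
end

# Example header value; the boundary string is made up.
header = 'multipart/report; report-type=delivery-status; boundary="b1"'
info = parse_report_content_type(header)
puts info['report-type'] if info
```

Because only the top-level Content-Type header is inspected, a check like this can run before any body part is parsed.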
|
110
|
+
|
118
|
+
The Multipart/Report content-type contains either two or three sub-
|
119
|
+
parts, in the following order:
|
120
|
+
|
121
|
+
(1) [Required] The first body part contains a human-readable message.
|
122
|
+
The purpose of this message is to provide an easily understood
|
123
|
+
description of the condition(s) that caused the report to be
|
124
|
+
generated, for a human reader who may not have a user agent capable
|
125
|
+
of interpreting the second section of the Multipart/Report.
|
126
|
+
|
127
|
+
The text in the first section may be in any MIME standards-track
|
128
|
+
content-type, charset, or language. Where a description of the error
|
129
|
+
is desired in several languages or several media, a
|
130
|
+
Multipart/Alternative construct may be used.
|
131
|
+
|
132
|
+
This body part may also be used to send detailed information that
|
133
|
+
cannot be easily formatted into a Message/Report body part.
|
134
|
+
|
135
|
+
(2) [Required] A machine parsable body part containing an account of
|
136
|
+
the reported message handling event. The purpose of this body part is
|
137
|
+
to provide a machine-readable description of the condition(s) that
|
138
|
+
caused the report to be generated, along with details not present in
|
139
|
+
the first body part that may be useful to human experts. An initial
|
140
|
+
body part, Message/delivery-status, is defined in [DSN].
|
141
|
+
|
142
|
+
(3) [Optional] A body part containing the returned message or a
|
143
|
+
portion thereof. This information may be useful to aid human experts
|
144
|
+
in diagnosing problems. (Although it may also be useful to allow the
|
145
|
+
sender to identify the message for which the report was issued, it is
|
146
|
+
hoped that the envelope-id and original-recipient-address returned in
|
147
|
+
the Message/Report body part will replace the traditional use of the
|
148
|
+
returned content for this purpose.)
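The ordering of the two or three sub-parts listed above can be sketched as follows. This is only an illustration under stated assumptions: the method name `build_report`, the boundary, and all sample field values are made up, and other message headers (From, To, Subject, Date) are omitted. The report-type parameter is set to the sub-type of the second body part, as the text requires.

```ruby
# Illustrative only: method name, boundary, and sample values are
# assumptions, not taken from RFC 3462.
def build_report(boundary:, human_text:, status_fields:, returned_headers: nil)
  parts = []
  # (1) Required: human-readable explanation.
  parts << "Content-Type: text/plain; charset=us-ascii\r\n\r\n#{human_text}\r\n"
  # (2) Required: machine-parsable account of the event.
  parts << "Content-Type: message/delivery-status\r\n\r\n#{status_fields}\r\n"
  # (3) Optional: the returned message, here reduced to its headers.
  if returned_headers
    parts << "Content-Type: text/rfc822-headers\r\n\r\n#{returned_headers}\r\n"
  end

  header = "MIME-Version: 1.0\r\n" \
           "Content-Type: multipart/report; report-type=delivery-status;\r\n" \
           "\tboundary=\"#{boundary}\"\r\n\r\n"
  body = parts.map { |part| "--#{boundary}\r\n#{part}" }.join
  header + body + "--#{boundary}--\r\n"
end

# Sample values for demonstration.
report = build_report(
  boundary: 'b1',
  human_text: 'Delivery to user@example.com failed.',
  status_fields: "Reporting-MTA: dns; mail.example.com\r\n\r\n" \
                 "Final-Recipient: rfc822; user@example.com\r\n" \
                 "Action: failed\r\nStatus: 5.1.1",
  returned_headers: 'Subject: hello'
)
puts report
```

Leaving out `returned_headers` yields the two-part form, which is still valid because the third part is optional.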
|
149
|
+
|
150
|
+
Return of content may be wasteful of network bandwidth and a variety
|
151
|
+
of implementation strategies can be used. Generally the sender
|
152
|
+
should choose the appropriate strategy and inform the recipient of
|
153
|
+
the level of returned content required. In the absence of
|
154
|
+
an explicit request for level of return of content such as that
|
155
|
+
provided in [DRPT], the agent that generated the delivery service
|
156
|
+
report should return the full message content.
|
157
|
+
|
158
|
+
When 8-bit or binary data not encoded in a 7 bit form is to be
|
159
|
+
returned, and the return path is not guaranteed to be 8-bit or binary
|
160
|
+
capable, two options are available. The original message MAY be re-
|
161
|
+
encoded into a legal 7-bit MIME message or the Text/RFC822-Headers
|
162
|
+
content-type MAY be used to return only the original message headers.
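One way to picture the choice described above is the sketch below. It implements only the second option (falling back to Text/RFC822-Headers) and does not attempt the re-encoding option. The names `seven_bit_clean?`, `returned_part`, and `path_is_8bit_capable` are hypothetical, and the 7-bit check is simplified: it looks only for bytes above 127 and ignores line-length limits.

```ruby
# Hypothetical helpers; names and the keyword argument are assumptions.

# Simplified check: true when every byte is in the 7-bit range.
def seven_bit_clean?(data)
  data.bytes.all? { |byte| byte < 128 }
end

# Returns [content_type, payload] for the optional third body part.
def returned_part(raw_message, path_is_8bit_capable: false)
  if seven_bit_clean?(raw_message) || path_is_8bit_capable
    ['message/rfc822', raw_message]
  else
    # Fall back to the headers only: everything before the first blank line.
    headers = raw_message.b.partition(/\r?\n\r?\n/).first
    ['text/rfc822-headers', headers]
  end
end

# Sample message with an 8-bit character in the body.
raw = "Subject: hi\r\n\r\ncaf\u00e9\r\n"
type, payload = returned_part(raw)
puts type
puts payload
```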
|
163
|
+
|
173
|
+
|
174
|
+
2. The Text/RFC822-Headers content-type
|
175
|
+
|
176
|
+
The Text/RFC822-Headers MIME content-type provides a mechanism to
|
177
|
+
label and return only the RFC 822 headers of a failed message. These
|
178
|
+
headers are not the complete message and should not be returned as a
|
179
|
+
Message/RFC822. The returned headers are useful for identifying the
|
180
|
+
failed message and for diagnostics based on the received lines.
|
181
|
+
|
182
|
+
The Text/RFC822-Headers content-type is defined as follows:
|
183
|
+
|
184
|
+
MIME type name: Text
|
185
|
+
MIME subtype name: RFC822-Headers
|
186
|
+
Required parameters: None
|
187
|
+
Optional parameters: None
|
188
|
+
Encoding considerations: 7 bit is sufficient for normal RFC822
|
189
|
+
headers; however, if the headers are broken and require
|
190
|
+
encoding to make them legal 7 bit content, they may be
|
191
|
+
encoded in quoted-printable.
|
192
|
+
Security considerations: See section 3 of this memo.
|
193
|
+
|
194
|
+
The Text/RFC822-Headers body part should contain all the RFC822
|
195
|
+
header lines from the message which caused the report. The RFC822
|
196
|
+
headers include all lines prior to the blank line in the message.
|
197
|
+
They include the MIME-Version and MIME Content-Headers.
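The rule that the headers are "all lines prior to the blank line" can be sketched as below. This is a minimal illustration with a hypothetical helper name, `extract_rfc822_headers`; it accepts either CRLF or bare LF line endings and assumes the message contains at least one header line before the blank line.

```ruby
# Hypothetical helper: returns the header block of a raw message, that is,
# everything before the first blank line. If no blank line is found, the
# whole input is treated as headers.
def extract_rfc822_headers(raw_message)
  header_block, _separator, _body = raw_message.partition(/\r?\n\r?\n/)
  header_block
end

# Sample message for demonstration.
raw = "From: sender@example.com\r\n" \
      "MIME-Version: 1.0\r\n" \
      "Content-Type: text/plain\r\n" \
      "\r\n" \
      "Body text\r\n"
puts extract_rfc822_headers(raw)
```

The MIME-Version and Content-Type lines are kept, matching the statement above that the MIME headers are part of the returned header block.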
|
198
|
+
|
199
|
+
3. Security Considerations
|
200
|
+
|
201
|
+
Automated use of report types without authentication presents several
|
202
|
+
security issues. Forging negative reports presents the opportunity
|
203
|
+
for denial-of-service attacks when the reports are used for automated
|
204
|
+
maintenance of directories or mailing lists. Forging positive
|
205
|
+
reports may cause the sender to incorrectly believe a message was
|
206
|
+
delivered when it was not.
|
207
|
+
|
208
|
+
A signature covering the entire multipart/report structure could be
|
209
|
+
used to prevent such forgeries; such a signature scheme is, however,
|
210
|
+
beyond the scope of this document.
|
211
|
+
|
230
|
+
4. Normative References
|
231
|
+
|
232
|
+
[SMTP] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC
|
233
|
+
821, August 1982.
|
234
|
+
|
235
|
+
[DSN] Moore, K., and G. Vaudreuil, "An Extensible Message Format
|
236
|
+
for Delivery Status Notifications", RFC 3464, January
|
237
|
+
2003.
|
238
|
+
|
239
|
+
[RFC822] Crocker, D., "Standard for the format of ARPA Internet
|
240
|
+
Text Messages", STD 11, RFC 822, August 1982.
|
241
|
+
|
242
|
+
[MIME] Borenstein, N. and N. Freed, "Multipurpose Internet Mail
|
243
|
+
Extensions (MIME) Part Two: Media Types", RFC 2046,
|
244
|
+
November 1996.
|
245
|
+
|
246
|
+
[DRPT] Moore, K., "SMTP Service Extension for Delivery Status
|
247
|
+
Notifications", RFC 3461, January 2003.
|
248
|
+
|
249
|
+
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
|
250
|
+
Requirement Levels", BCP 14, RFC 2119, March 1997.
|
251
|
+
|
252
|
+
|
253
|
+
|
254
|
+
|
255
|
+
|
256
|
+
|
257
|
+
|
258
|
+
|
259
|
+
|
260
|
+
|
261
|
+
|
262
|
+
|
263
|
+
|
264
|
+
|
265
|
+
|
266
|
+
|
267
|
+
|
268
|
+
|
269
|
+
|
270
|
+
|
271
|
+
|
272
|
+
|
273
|
+
|
274
|
+
|
275
|
+
|
276
|
+
|
277
|
+
|
278
|
+
|
279
|
+
|
280
|
+
|
281
|
+
Vaudreuil Standards Track [Page 5]
|
282
|
+
|
283
|
+
RFC 3462 Multipart/Report January 2003
|
284
|
+
|
285
|
+
|
286
|
+
Appendix A - Changes from RFC 1892
|
287
|
+
|
288
|
+
Changed Authors contact information
|
289
|
+
|
290
|
+
Updated required standards boilerplate
|
291
|
+
|
292
|
+
Edited the text to make it spell-checker and grammar checker
|
293
|
+
compliant
|
294
|
+
|
295
|
+
Author's Address
|
296
|
+
|
297
|
+
Gregory M. Vaudreuil
|
298
|
+
Lucent Technologies
|
299
|
+
7291 Williamson Rd
|
300
|
+
Dallas Tx, 75214
|
301
|
+
|
302
|
+
Phone: +1 214 823 9325
|
303
|
+
EMail: GregV@ieee.org

Full Copyright Statement

Copyright (C) The Internet Society (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the
Internet Society.

@@ -0,0 +1,898 @@
Network Working Group                                        J. Klensin
Request for Comments: 3696                                February 2004
Category: Informational


    Application Techniques for Checking and Transformation of Names

Status of this Memo

This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

Many Internet applications have been designed to deduce top-level
domains (or other domain name labels) from partial information. The
introduction of new top-level domains, especially non-country-code
ones, has exposed flaws in some of the methods used by these
applications. These flaws make it more difficult, or impossible, for
users of the applications to access the full Internet. This memo
discusses some of the techniques that have been used and gives some
guidance for minimizing their negative impact as the domain name
environment evolves. This document draws summaries of the applicable
rules together in one place and supplies references to the actual
standards.

Klensin                      Informational                     [Page 1]

RFC 3696        Checking and Transformation of Names      February 2004

Table of Contents

1. Introduction
2. Restrictions on domain (DNS) names
3. Restrictions on email addresses
4. URLs and URIs
   4.1. URI syntax definitions and issues
   4.2. The HTTP URL
   4.3. The MAILTO URL
   4.4. Guessing domain names in web contexts
5. Implications of internationalization
6. Summary
7. Security Considerations
8. Acknowledgements
9. References
   9.1. Normative References
   9.2. Informative References
10. Author's Address
11. Full Copyright Statement

1. Introduction

Designers of user interfaces to Internet applications have often
found it useful to examine user-provided values for validity before
passing them to the Internet tools themselves. This type of test,
most commonly involving syntax checks or application of other rules
to domain names, email addresses, or "web addresses" (URLs or,
occasionally, extended URI forms (see Section 4)) may enable
better-quality diagnostics for the user than might be available from
the protocol itself. Local validity tests on values are also thought
to improve the efficiency of back-office processing programs and to
reduce the load on the protocols themselves. Certainly, they are
consistent with the well-established principle that it is better to
detect errors as early as possible.

The tests must, however, be made correctly or at least safely. If
criteria are applied that do not match the protocols, users will be
inconvenienced, addresses and sites will effectively become
inaccessible to some groups, and business and communications
opportunities will be lost. Experience in recent years indicates
that syntax tests are often performed incorrectly and that tests for
top-level domain names are applied using obsolete lists and
conventions. We assume that most of these incorrect tests are the
result of the inability to conveniently locate exact definitions for
the criteria to be applied. This document draws summaries of the
applicable rules together in one place and supplies references to the
actual standards. It does not add anything to those standards; it
merely draws the information together into a form that may be more
accessible.

Many experts on Internet protocols believe that tests and rules of
these sorts should be avoided in applications and that the tests in
the protocols and back-office systems should be relied on instead.
Certainly implementations of the protocols cannot assume that the
data passed to them will be valid. Unless the standards specify
particular behavior, this document takes no position on whether or
not the testing is desirable. It only identifies the correct tests
to be made if tests are to be applied.

The sections that follow discuss domain names, email addresses, and
URLs.

2. Restrictions on domain (DNS) names

The authoritative definitions of the format and syntax of domain
names appear in RFCs 1035 [RFC1035], 1123 [RFC1123], and 2181
[RFC2181].

Any characters, or combination of bits (as octets), are permitted in
DNS names. However, there is a preferred form that is required by
most applications. This preferred form has been the only one
permitted in the names of top-level domains, or TLDs. In general, it
is also the only form permitted in most second-level names registered
in TLDs, although some names that are normally not seen by users obey
other rules. It derives from the original ARPANET rules for the
naming of hosts (i.e., the "hostname" rule) and is perhaps better
described as the "LDH rule", after the characters that it permits.
The LDH rule, as updated, provides that the labels (words or strings
separated by periods) that make up a domain name must consist of only
the ASCII [ASCII] alphabetic and numeric characters, plus the hyphen.
No other symbols or punctuation characters are permitted, nor is
blank space. If the hyphen is used, it is not permitted to appear at
either the beginning or end of a label. There is an additional rule
that essentially requires that top-level domain names not be
all-numeric.
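
As an illustration of the LDH rule for a single label, a minimal Ruby
sketch follows. The constant and method names (LDH_LABEL, ldh_label?,
plausible_tld_label?) are hypothetical and chosen only for this
example; label length limits are not checked here.

```ruby
# Letters, digits and hyphens only; a hyphen may not start or end the label.
LDH_LABEL = /\A[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\z/

# True when a single label (the text between two periods) obeys the LDH rule.
def ldh_label?(label)
  LDH_LABEL.match?(label)
end

# The additional rule: a top-level domain label must not be all-numeric.
def plausible_tld_label?(label)
  ldh_label?(label) && !label.match?(/\A[0-9]+\z/)
end

puts ldh_label?('my-host')        # hyphen inside the label is allowed
puts ldh_label?('-host')          # leading hyphen is not
puts plausible_tld_label?('123')  # all-numeric TLD is rejected
```

The check is purely syntactic. It says nothing about whether the label
exists in the DNS.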

When it is necessary to express labels with non-character octets, or
to embed periods within labels, there is a mechanism for keying them
in that utilizes an escape sequence. RFC 1035 [RFC1035] should be
consulted if that mechanism is needed (most common applications,
including email and the Web, will generally not permit those escaped
strings). A special encoding is now available for non-ASCII
characters; see the brief discussion in Section 5.

Most Internet applications that reference other hosts or systems
assume they will be supplied with "fully-qualified" domain names,
i.e., ones that include all of the labels leading to the root,
including the TLD name. Those fully-qualified domain names are then
passed to either the domain name resolution protocol itself or to the
remote systems. Consequently, purported DNS names to be used in
applications and to locate resources generally must contain at least
one period (".") character. Those that do not are either invalid or
require the application to supply additional information. Of course,
this principle does not apply when the purpose of the application is
to process or query TLD names themselves. The DNS specification also
permits a trailing period to be used to denote the root, e.g.,
"a.b.c" and "a.b.c." are equivalent, but the latter is more explicit
and is required to be accepted by applications. This convention is
especially important when a TLD name is being referred to directly.
For example, while ".COM" has become the popular terminology for
referring to that top-level domain, "COM." would be strictly and
technically correct in talking about the DNS, since it shows that
"COM" is a top-level domain name.

There is a long history of applications moving beyond the "one or
more periods" test in an attempt to verify that a valid TLD name is
actually present. They have done this either by applying some
heuristics to the form of the name or by consulting a local list of
valid names. The historical heuristics are no longer effective. If
one is to keep a local list, much more effort must be devoted to
keeping it up-to-date than was the case several years ago.

The heuristics were based on the observation that, since the DNS was
first deployed, all top-level domain names were two, three, or four
characters in length. All two-character names were associated with
"country code" domains, with the specific labels (with a few early
exceptions) drawn from the ISO list of codes for countries and
similar entities [IS3166]. The three-letter names were "generic"
TLDs, whose function was not country-specific, and there was exactly
one four-letter TLD, the infrastructure domain "ARPA." [RFC1591].
However, these length-dependent rules were conventions, rather than
anything on which the protocols depended.

Before the mid-1990s, lists of valid top-level domain names changed
infrequently. New country codes were gradually, and then more
rapidly, added as the Internet expanded, but the list of generic
domains did not change at all between the establishment of the "INT."
domain in 1988 and ICANN's allocation of new generic TLDs in 2000.
Some application developers responded by assuming that any two-letter
domain name could be valid as a TLD, but the list of generic TLDs was
fixed and could be kept locally and tested. Several of these
assumptions changed as ICANN started to allocate new top-level
domains: one two-letter domain that does not appear in the ISO 3166-1
table [ISO.3166.1988] was tentatively approved, and new domains were
created with three, four, and even six letter codes.

As of the first quarter of 2003, the list of valid, non-country,
top-level domains was .AERO, .BIZ, .COM, .COOP, .EDU, .GOV, .INFO,
.INT, .MIL, .MUSEUM, .NAME, .NET, .ORG, .PRO, and .ARPA. ICANN is
expected to expand that list at regular intervals, so the list that
appears here should not be used in testing. Instead, systems that
filter by testing top-level domain names should regularly update
their local tables of TLDs (both "generic" and country-code-related)
by polling the list published by IANA [DomainList]. It is
likely that the better strategy has now become to make the "at least
one period" test, to verify LDH conformance (including verification
that the apparent TLD name is not all-numeric), and then to use the
DNS to determine domain name validity, rather than trying to maintain
a local list of valid TLD names.

A DNS label may be no more than 63 octets long. This is in the form
actually stored; if a non-ASCII label is converted to encoded
"punycode" form (see Section 5), the length of that form may restrict
the number of actual characters (in the original character set) that
can be accommodated. A complete, fully-qualified, domain name must
not exceed 255 octets.
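
The syntactic part of the strategy described above (the "at least one
period" test, LDH conformance, a non-numeric apparent TLD, and the
length limits) can be sketched as follows. The names used here are
hypothetical, the 255-octet limit is applied to the textual form as a
simplification, and the final step of asking the DNS whether the name
exists is deliberately left out.

```ruby
MAX_LABEL_OCTETS = 63   # limit on a single label
MAX_NAME_OCTETS  = 255  # limit on the complete name (textual form assumed)
LDH = /\A[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\z/

# Syntax-only check; it does not consult the DNS or any TLD list.
def plausible_domain_name?(name)
  return false if name.bytesize > MAX_NAME_OCTETS

  # A trailing period only denotes the root, so "a.b.c." equals "a.b.c".
  # The -1 limit keeps empty labels, so "a..b" is rejected below.
  labels = name.chomp('.').split('.', -1)
  return false if labels.size < 2 # the "at least one period" test
  return false unless labels.all? { |l| l.bytesize <= MAX_LABEL_OCTETS && LDH.match?(l) }

  !labels.last.match?(/\A[0-9]+\z/) # apparent TLD must not be all-numeric
end

puts plausible_domain_name?('example.aero')
puts plausible_domain_name?('localhost')
```

A name that passes this check is only a candidate. Under the strategy
above, the DNS lookup decides whether it is valid.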

Some additional mechanisms for guessing correct domain names when
incomplete information is provided have been developed for use with
the web and are discussed in Section 4.4.

3. Restrictions on email addresses

Reference documents: RFC 2821 [RFC2821] and RFC 2822 [RFC2822]

Contemporary email addresses consist of a "local part" separated from
a "domain part" (a fully-qualified domain name) by an at-sign ("@").
The syntax of the domain part corresponds to that in the previous
section. The concerns identified in that section about filtering and
lists of names apply to the domain names used in an email context as
well. The domain name can also be replaced by an IP address in
square brackets, but that form is strongly discouraged except for
testing and troubleshooting purposes.

The local part may appear using the quoting conventions described
below. The quoted forms are rarely used in practice, but are
required for some legitimate purposes. Hence, they should not be
rejected in filtering routines but should instead be passed to the
email system for evaluation by the destination host.

The exact rule is that any ASCII character, including control
characters, may appear quoted, or in a quoted string. When quoting
is needed, the backslash character is used to quote the following
character. For example

   Abc\@def@example.com

is a valid form of an email address. Blank spaces may also appear,
as in

   Fred\ Bloggs@example.com

The backslash character may also be used to quote itself, e.g.,

   Joe.\\Blow@example.com

In addition to quoting using the backslash character, conventional
double-quote characters may be used to surround strings. For example

   "Abc@def"@example.com

   "Fred Bloggs"@example.com

are alternate forms of the first two examples above. These quoted
forms are rarely recommended, and are uncommon in practice, but, as
discussed above, must be supported by applications that are
processing email addresses. In particular, the quoted forms often
appear in the context of addresses associated with transitions from
other systems and contexts; those transitional requirements do still
arise and, since a system that accepts a user-provided email address
cannot "know" whether that address is associated with a legacy
system, the address forms must be accepted and passed into the email
environment.

Without quotes, local-parts may consist of any combination of
alphabetic characters, digits, or any of the special characters

   ! # $ % & ' * + - / = ? ^ _ ` . { | } ~

period (".") may also appear, but may not be used to start or end the
local part, nor may two or more consecutive periods appear. Stated
differently, any ASCII graphic (printing) character other than the
at-sign ("@"), backslash, double quote, comma, or square brackets may
appear without quoting. If any of that list of excluded characters
are to appear, they must be quoted. Forms such as

   user+mailbox@example.com

   customer/department=shipping@example.com

   $A12345@example.com

   !def!xyz%abc@example.com

   _somename@example.com

are valid and are seen fairly regularly, but any of the characters
listed above are permitted. In the context of local parts,
apostrophe ("'") and grave accent ("`") are ordinary characters, not
quoting characters. Some of the characters listed above are used in
conventions about routing or other types of special handling by some
receiving hosts. But, since there is no way to know whether the
remote host is using those conventions or just treating these
characters as normal text, sending programs (and programs evaluating
address validity) must simply accept the strings and pass them on.
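
The rules for unquoted local parts can be sketched as a small Ruby
check. The names LOCAL_SPECIALS and unquoted_local_part? are
hypothetical. A false result here only means that the local part
would need one of the quoted forms, not that the address is invalid,
since quoted forms are to be accepted and passed on.

```ruby
# The special characters listed above (period is handled separately).
# A single-quoted literal is used so that "#" and "$" are not interpolated.
LOCAL_SPECIALS = '!#$%&\'*+-/=?^_`{|}~'.freeze

# True when the local part is acceptable WITHOUT any quoting.
def unquoted_local_part?(local)
  return false if local.empty?
  return false if local.start_with?('.') || local.end_with?('.')
  return false if local.include?('..') # no consecutive periods

  local.each_char.all? do |ch|
    ch == '.' || ch.match?(/[A-Za-z0-9]/) || LOCAL_SPECIALS.include?(ch)
  end
end

puts unquoted_local_part?('customer/department=shipping')
puts unquoted_local_part?('Fred Bloggs') # needs quoting because of the space
```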

In addition to restrictions on syntax, there is a length limit on
email addresses. That limit is a maximum of 64 characters (octets)
in the "local part" (before the "@") and a maximum of 255 characters
(octets) in the domain part (after the "@") for a total length of 320
characters. Systems that handle email should be prepared to process
addresses which are that long, even though they are rarely
encountered.
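
A minimal Ruby sketch of the length limits as stated above follows.
The method name is hypothetical. The address is split at the last
"@", on the assumption that an "@" inside the local part can only
occur in a quoted form and the domain part never contains one.

```ruby
MAX_LOCAL_OCTETS  = 64   # limit on the local part, as stated in the text
MAX_DOMAIN_OCTETS = 255  # limit on the domain part, as stated in the text

# Checks only the two length limits; syntax is not examined here.
def within_length_limits?(address)
  at = address.rindex('@') # the last "@" separates local part and domain
  return false if at.nil?

  local  = address[0...at]
  domain = address[(at + 1)..-1]
  local.bytesize <= MAX_LOCAL_OCTETS && domain.bytesize <= MAX_DOMAIN_OCTETS
end

puts within_length_limits?(('a' * 64) + '@example.com')
puts within_length_limits?(('a' * 65) + '@example.com')
```

Octets are counted with bytesize rather than length, so a multi-byte
character counts for as many octets as it occupies.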

4. URLs and URIs

4.1. URI syntax definitions and issues

The syntax for URLs (Uniform Resource Locators) is specified in
[RFC1738]. The syntax for the more general "URI" (Uniform Resource
Identifier) is specified in [RFC2396]. The URI syntax is extremely
general, with considerable variations permitted according to the type
of "scheme" (e.g., "http", "ftp", "mailto") that is being used.
While it is possible to use the general syntax rules of RFC 2396 to
perform syntax checks, they are general enough --essentially only
specifying the separation of the scheme name and "scheme specific
part" with a colon (":") and excluding some characters that must be
escaped if used-- to provide little significant filtering or
validation power.

The following characters are reserved in many URIs -- they must be
used for either their URI-intended purpose or must be encoded. Some
particular schemes may either broaden or relax these restrictions
(see the following sections for URLs applicable to "web pages" and
electronic mail), or apply them only to particular URI component
parts.

   ; / ? : @ & = + $ ,

In addition, control characters, the space character, the
double-quote (") character, and the following special characters

   < > # %

are generally forbidden and must either be avoided or escaped, as
discussed below.

The colon after the scheme name, and the percent sign used to escape
characters, are specifically reserved for those purposes, although
":" may also be used elsewhere in some schemes.

When it is necessary to encode these, or other, characters, the
method used is to replace each one with a percent-sign ("%") followed
by two hexadecimal digits representing its octet value. See section
2.4.1 of [RFC2396] for an exact definition. Unless it is used as a
delimiter of the URI scheme itself, any character may optionally be
encoded this way; systems that are testing URI syntax should be
prepared for these encodings to appear in any component of the URI
except the scheme name itself.
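
The encoding step itself is small enough to sketch in Ruby. The
method name percent_encode and its second parameter (the set of
characters the caller wants encoded) are hypothetical; which
characters need encoding depends on the scheme and the URI component.

```ruby
# Replace each character found in chars_to_encode with "%" followed by
# two hexadecimal digits per octet; all other characters pass through.
def percent_encode(text, chars_to_encode)
  text.each_char.map do |ch|
    if chars_to_encode.include?(ch)
      ch.bytes.map { |octet| format('%%%02X', octet) }.join
    else
      ch
    end
  end.join
end

puts percent_encode('a b<c>', ' <>')
puts percent_encode('100%', '%')
```

Because any character may optionally be encoded, a checker should
treat "%41" and "A" as the same character wherever encodings are
allowed.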

A "generic URI" syntax is specified and is more restrictive, but
using it to test URI strings requires that one know whether or not
the particular scheme in use obeys that syntax. Consequently,
applications that intend to check or validate URIs should normally
identify the scheme name and then apply scheme-specific tests. The
rules for two of those -- HTTP [RFC1738] and MAILTO [RFC2368] URLs --
are discussed below, but the author of an application which intends
to make very precise checks, or to reject particular syntax rather
than just warning the user, should consult the relevant
scheme-definition documents for precise syntax and relationships.

4.2. The HTTP URL

Absolute HTTP URLs consist of the scheme name, a host name (expressed
as a domain name or IP address), an optional port number, and then,
optionally, a path, a search part, and a fragment identifier. These
are separated, respectively, by a colon and the two slashes that
precede the host name, a colon, a slash, a question mark, and a hash
mark ("#"). So we have

   http://host:port/path?search#fragment

   http://host/path/

   http://host/path#fragment

   http://host/path?search

   http://host

and other variations on that form. There is also a "relative" form,
but it almost never appears in text that a user might, e.g., enter
into a form. See [RFC2616] for details.
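
The component structure of an absolute HTTP URL can be seen by
parsing one with the URI module from Ruby's standard library. The
helper name http_url_parts is hypothetical, and port 8080 stands in
for the symbolic "port" in the forms above, which a real parser would
reject.

```ruby
require 'uri'

# Split an absolute HTTP (or HTTPS) URL into its components.
# Returns nil for text that does not parse or is not an HTTP URL.
def http_url_parts(text)
  uri = URI.parse(text)
  return nil unless uri.is_a?(URI::HTTP) # URI::HTTPS is a subclass

  { scheme: uri.scheme, host: uri.host, port: uri.port,
    path: uri.path, query: uri.query, fragment: uri.fragment }
rescue URI::InvalidURIError
  nil
end

p http_url_parts('http://host:8080/path?search#fragment')
p http_url_parts('http://host')
```

When no port is given, the parser reports the scheme's default port
and an empty path; the search part and fragment are nil.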
|
459
|
+
|
460
|
+
The characters
|
461
|
+
|
462
|
+
/ ; ?
|
463
|
+
|
464
|
+
are reserved within the path and search parts and must be encoded;
|
465
|
+
the first of these may be used unencoded, and is often used within
|
466
|
+
the path, to designate hierarchy.
|
467
|
+
|
468
|
+
4.3. The MAILTO URL
|
469
|
+
|
470
|
+
MAILTO is a URL type whose content is an email address. It can be
|
471
|
+
used to encode any of the email address formats discussed in Section
|
472
|
+
3 above. It can also support multiple addresses and the inclusion of
|
473
|
+
headers (e.g., Subject lines) within the body of the URL. MAILTO is
|
474
|
+
authoritatively defined in RFC 2368 [RFC2368]; anyone expecting to
|
475
|
+
accept and test multiple addresses or mail header or body formats
|
476
|
+
should consult that document carefully.
|
477
|
+
|
478
|
+
In accepting text for, or validating, a MAILTO URL, it is important
|
479
|
+
to note that, while it can be used to encode any valid email address,
|
480
|
+
it is not sufficient to copy an email address into a MAILTO URL since
|
481
|
+
email addresses may include a number of characters that are invalid
|
482
|
+
in, or have reserved uses for, URLs. Those characters must be
|
483
|
+
encoded, as outlined in Section 4.1 above, when the addresses are
|
484
|
+
mapped into the URL form. Conversely, addresses in MAILTO URLs
|
485
|
+
cannot, in general, be copied directly into email contexts, since few
|
486
|
+
email programs will reverse the decodings (and doing so might be
|
487
|
+
interpreted as a protocol violation).
|
488
|
+
|
489
|
+
The following characters may appear in MAILTO URLs only with the
|
490
|
+
specific defined meanings given. If they appear in an email address
|
491
|
+
(i.e., for some other purpose), they must be encoded:
|
492
|
+
|
493
|
+
: The colon in "mailto:"
|
494
|
+
|
495
|
+
< > # " % { } | \ ^ ~ `
|
496
|
+
|
497
|
+
These characters are "unsafe" in any URL, and must always be
|
498
|
+
encoded.
|
499
|
+
|
500
|
+
|
501
|
+
|
502
|
+
|
503
|
+
Klensin Informational [Page 9]
|
504
|
+
|
505
|
+
RFC 3696 Checking and Transformation of Names February 2004
|
506
|
+
|
507
|
+
|
508
|
+
The following characters must also be encoded if they appear in a
|
509
|
+
MAILTO URL
|
510
|
+
|
511
|
+
? & =
|
512
|
+
Used to delimit headers and their values when these are encoded
|
513
|
+
into URLs.
|
514
|
+
|
515
|
+
Some examples may be helpful:
|
516
|
+
|
517
|
+
+-------------------------+-----------------------------+-----------+
|
518
|
+
| Email address | MAILTO URL | Notes |
|
519
|
+
+-------------------------+-----------------------------+-----------+
|
520
|
+
| Joe@example.com | mailto:joe@example.com | 1 |
|
521
|
+
| | | |
|
522
|
+
| user+mailbox@example | mailto: | 2 |
|
523
|
+
| .com | user%2Bmailbox@example | |
|
524
|
+
| | .com | |
|
525
|
+
| | | |
|
526
|
+
| customer/department= | mailto:customer%2F | 3 |
|
527
|
+
| shipping@example.com | department=shipping@example | |
|
528
|
+
| | .com | |
|
529
|
+
| | | |
|
530
|
+
| $A12345@example.com | mailto:$A12345@example | 4 |
|
531
|
+
| | .com | |
|
532
|
+
| | | |
|
533
|
+
| !def!xyz%abc@example | mailto:!def!xyz%25abc | 5 |
|
534
|
+
| .com | @example.com | |
|
535
|
+
| | | |
|
536
|
+
| _somename@example.com | mailto:_somename@example | 4 |
|
537
|
+
| | .com | |
|
538
|
+
+-------------------------+-----------------------------+-----------+
|
539
|
+
|
540
|
+
Table 1
|
541
|
+
|
542
|
+
Notes on Table
|
543
|
+
|
544
|
+
1. No characters appear in the email address that require escaping,
|
545
|
+
so the body of the MAILTO URL is identical to the email address.
|
546
|
+
|
547
|
+
2. There is actually some uncertainty as to whether or not the "+"
|
548
|
+
characters requires escaping in MAILTO URLs (the standards are
|
549
|
+
not precisely clear). But, since any character in the address
|
550
|
+
specification may optionally be encoded, it is probably safer to
|
551
|
+
encode it.
|
552
|
+
|
553
|
+
3. The "/" character is generally reserved in URLs, and must be
|
554
|
+
encoded as %2F.
|
555
|
+
|
556
|
+
|
557
|
+
|
558
|
+
|
559
|
+
Klensin Informational [Page 10]
|
560
|
+
|
561
|
+
RFC 3696 Checking and Transformation of Names February 2004
|
562
|
+
|
563
|
+
|
564
|
+
4. Neither the "$" nor the "_" character is given any special
   interpretation in MAILTO URLs, so neither needs to be encoded.

5. While the "!" character has no special interpretation, the "%"
   character is used to introduce encoded sequences and hence it
   must always be encoded.
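
The encoding decisions in the notes above can be sketched in a few lines
of Ruby. This is an illustration only, not part of any library: the
helper name mailto_url and the set of characters left unescaped
(letters, digits, and - . _ ~ ! $ =) are assumptions read off the table
rows, not a complete reading of RFC 2368. The sketch also assumes the
address contains an "@".

```ruby
# Hypothetical helper: build a MAILTO URL from an email address.
# Characters matching SAFE pass through unchanged; every other character
# in the local part (including "+", "/" and "%") is percent-encoded.
SAFE = /[A-Za-z0-9\-._~!$=]/

def mailto_url(address)
  # Split at the last "@": the domain cannot contain "@", the local part may.
  local, _at, domain = address.rpartition('@')
  encoded = local.each_char.map do |ch|
    if ch =~ SAFE
      ch
    else
      # Encode each byte of the character as %XX with upper-case hex digits.
      ch.bytes.map { |b| format('%%%02X', b) }.join
    end
  end.join
  "mailto:#{encoded}@#{domain}"
end
```

With this sketch, mailto_url('user+mailbox@example.com') returns
"mailto:user%2Bmailbox@example.com", matching the second row of the
table. The helper leaves the case of the local part exactly as typed.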

4.4. Guessing domain names in web contexts

Several web browsers have adopted a practice that permits an
incomplete domain name to be used as input instead of a complete URL.
This has, for example, permitted users to type "microsoft" and have
the browser interpret the input as "http://www.microsoft.com/".
Other browser versions have gone even further, trying to build DNS
names up through a series of heuristics, testing each variation in
turn to see if it appears in the DNS, and accepting the first one
found as the intended domain name. Still others automatically
invoke search engines if no period appears or if the reference fails.
If any of these approaches are to be used, it is often critical that
the browser recognize the complete list of TLDs. If an incomplete
list is used, complete domain names may not be recognized as such and
the system may try to turn them into completely different names. For
example, "example.aero" is a fully-qualified name, since "AERO." is a
TLD name. But, if the system doesn't recognize "AERO" as a TLD name,
it is likely to try to look up "example.aero.com" and
"www.example.aero.com" (and then fail or find the wrong host), rather
than simply looking up the user-supplied name.
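
The failure mode described above can be made concrete with a small
sketch. The function name guess_candidates, the two sample TLD lists,
and the "append .com, then prepend www." heuristic are all assumptions
made for illustration; no real browser's algorithm is reproduced here.

```ruby
# Hypothetical name-guessing heuristic driven by a local TLD list.
# Returns the names the software would try to look up, in order.
def guess_candidates(input, known_tlds)
  name = input.downcase.chomp('.')
  last_label = name.split('.').last
  if name.include?('.') && known_tlds.include?(last_label)
    # The last label is a recognized TLD: treat the name as complete.
    [name]
  else
    # Otherwise "complete" the name, which is where things go wrong
    # when the TLD list is out of date.
    ["#{name}.com", "www.#{name}.com"]
  end
end

stale_list   = %w[com net org edu]
current_list = stale_list + %w[aero]

p guess_candidates('example.aero', current_list)
p guess_candidates('example.aero', stale_list)
```

With the current list the user-supplied name is looked up as given;
with the stale list the same input turns into "example.aero.com" and
"www.example.aero.com", the two wrong names mentioned in the text.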

As discussed in Section 2 above, there are dangers associated with
software that attempts to "know" the list of top-level domain names
locally and take advantage of that knowledge. These name-guessing
heuristics are another example of that situation: if the lists are
up-to-date and used carefully, the systems in which they are embedded
may provide an easier, and more attractive, experience for at least
some users. But finding the wrong host, or being unable to find a
host even when its name is precisely known, is a bad experience by
any measure.

More generally, there have been bad experiences with attempts to
"complete" domain names by adding additional information to them.
These issues are described in some detail in RFC 1535 [RFC1535].

5. Implications of internationalization

The IETF has adopted a series of proposals ([RFC3490] - [RFC3492])
whose purpose is to permit encoding internationalized (i.e.,
non-ASCII) names in the DNS. The primary standard, and the group
generically, are known as "IDNA". The actual strings stored in the
DNS are in an encoded form: the labels begin with the characters
"xn--" followed by the encoded string. Applications should be
prepared to accept and process the encoded form (those strings are
consistent with the "LDH rule" (see Section 2) so should not raise
any separate issues) and the use of local, and potentially other,
characters as appropriate to local systems and circumstances.
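
As a minimal sketch of what "accepting the encoded form" can mean in
practice, the following Ruby checks a single label against the LDH rule
and detects the "xn--" prefix. The method names and the 63-octet label
limit written into the pattern are choices made for this illustration.
Detecting the prefix is not IDNA validation; it says nothing about
whether the encoded string that follows is well-formed.

```ruby
# LDH rule: only ASCII letters, digits and hyphens, with no hyphen at
# the start or end of the label. The pattern also caps the label at
# 63 characters, the DNS limit for a single label.
LDH_LABEL = /\A[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\z/

# True if the label satisfies the LDH rule.
def ldh_label?(label)
  LDH_LABEL.match?(label)
end

# True if the label carries the IDNA prefix, compared case-insensitively.
# This only detects the prefix; it does not decode or validate the rest.
def idna_encoded_label?(label)
  label.downcase.start_with?('xn--')
end
```

An encoded label such as "xn--bcher-kva" passes both checks, so code
that already handles ordinary LDH labels needs no special case to carry
it through unchanged.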

The IDNA specification describes the exact process to be used to
validate a name or encoded string. The process is sufficiently
complex that shortcuts or heuristics, especially for versions of
labels written directly in Unicode or other coded character sets, are
likely to fail and cause problems. In particular, the strings cannot
be validated with syntax or semantic rules of any of the usual sorts:
syntax validity is defined only in terms of the result of executing a
particular function.

In addition to the restrictions imposed by the protocols themselves,
many domains are implementing rules about just which non-ASCII names
they will permit to be registered (see, e.g., [JET], [RegRestr]).
This work is still relatively new, and the rules and conventions are
likely to be different for each domain, or at least each language or
script group. Attempting to test for those rules in a client program
to see if a user-supplied name might possibly exist in the relevant
domain would almost certainly be ill-advised.

One quick local test, however, may be reasonable: as of the time of
this writing, there should be no instances of labels in the DNS that
start with two characters, followed by two hyphens, where the two
characters are not "xn" (in, of course, either upper or lower case).
Such label strings, if they appear, are probably erroneous or
obsolete, and it may be reasonable to at least warn the user about
them.
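
The quick local test described above is small enough to write out. The
method name suspicious_prefix? is made up for this sketch, and the
result is meant only as a trigger for a warning, in line with the text;
the rule reflects the state of the DNS at the time the text was
written and should not be used to reject names outright.

```ruby
# True if the label has hyphens in its third and fourth positions but
# does not start with "xn" (in either case). Such labels are candidates
# for a warning, not for outright rejection.
def suspicious_prefix?(label)
  return false unless label.length >= 4 && label[2, 2] == '--'

  label[0, 2].downcase != 'xn'
end
```

For example, suspicious_prefix?('ab--foo') is true, while both
'xn--bcher-kva' and 'XN--bcher-kva' are accepted without a warning.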

There is ongoing work in the IETF and elsewhere to define
internationalized formats for use in other protocols, including email
addresses. Those forms may or may not conform to existing rules for
ASCII-only identifiers; anyone designing evaluators or filters should
watch that work closely.

6. Summary

When an application accepts a string from the user and ultimately
passes it on to an API for a protocol, the desirability of testing or
filtering the text in any way not required by the protocol itself is
hotly debated. If it must divide the string into its components, or
otherwise interpret it, it obviously must make at least enough tests
to validate that process. With, e.g., domain names or email
addresses that can be passed on untouched, the appropriateness of
trying to figure out which ones are valid and which ones are not
requires a more complex decision, one that should include
considerations of how to make exactly the correct tests and to keep
information that changes and evolves up-to-date. A test containing
obsolete information can be extremely frustrating for potential
correspondents or customers and may harm desired relationships.

7. Security Considerations

Since this document merely summarizes the requirements of existing
standards, it does not introduce any new security issues. However,
many of the techniques that motivate the document raise important
security concerns of their own. Rejecting valid forms of domain
names, email addresses, or URIs often denies service to the user of
those entities. Worse, guessing at the user's intent when an
incomplete address, or other string, is given can result in
compromises to privacy or accuracy of reference if the wrong target
is found and returned. From a security standpoint, the optimum
behavior is probably to never guess, but instead, to force the user
to specify exactly what is wanted. When that position involves a
tradeoff with an acceptable user experience, good judgment should be
used and the fact that it is a tradeoff recognized.

Some characters have special or privileged meanings on some systems
(e.g., ` on Unix). Applications should be careful to escape those
locally if necessary. By the same token, they are valid, and should
not be disallowed locally, or escaped when transmitted through
Internet protocols, for such reasons if a remote site chooses to use
them.
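
A minimal Ruby sketch of "escape locally, transmit unchanged" follows.
It uses Shellwords.escape from Ruby's standard library; the sample
address and the command name "sendmail-like-tool" are invented for the
illustration and do not refer to a real program.

```ruby
require 'shellwords'

# An address whose local part contains a backquote: valid in email,
# but special to Unix shells.
address = 'odd`name@example.com'

# Escape only at the local boundary, when building a shell command line.
# "sendmail-like-tool" is a made-up command name for illustration.
command = "sendmail-like-tool #{Shellwords.escape(address)}"

# The address itself is stored and transmitted unchanged.
puts address
puts command
```

Where the platform allows it, passing the address as a separate
argument (for example, the multi-argument form of Kernel#system)
avoids the shell, and therefore the need for escaping, altogether.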

The presence of local checking does not permit remote checking to be
bypassed. Note that this can apply to a single machine; in
particular, a local MTA should not assume that a local MUA has
properly escaped locally-significant special characters.

8. Acknowledgements

The author would like to express his appreciation for helpful
comments from Harald Alvestrand, Eric A. Hall, and the RFC Editor,
and for partial support of this work from SITA. Responsibility for
any errors remains, of course, with the author.

The first Internet-Draft on this subject was posted in February 2003.
The document was submitted to the RFC Editor on 20 June 2003,
returned for revisions on 19 August, and resubmitted on 5 September
2003.

9. References

9.1. Normative References

[RFC1035]        Mockapetris, P., "Domain names - implementation and
                 specification", STD 13, RFC 1035, November 1987.

[RFC1123]        Braden, R., Ed., "Requirements for Internet Hosts -
                 Application and Support", STD 3, RFC 1123, October
                 1989.

[RFC1535]        Gavron, E., "A Security Problem and Proposed
                 Correction With Widely Deployed DNS Software", RFC
                 1535, October 1993.

[RFC1738]        Berners-Lee, T., Masinter, L. and M. McCahill,
                 "Uniform Resource Locators (URL)", RFC 1738, December
                 1994.

[RFC2181]        Elz, R. and R. Bush, "Clarifications to the DNS
                 Specification", RFC 2181, July 1997.

[RFC2368]        Hoffman, P., Masinter, L. and J. Zawinski, "The
                 mailto URL scheme", RFC 2368, July 1998.

[RFC2396]        Berners-Lee, T., Fielding, R. and L. Masinter,
                 "Uniform Resource Identifiers (URI): Generic Syntax",
                 RFC 2396, August 1998.

[RFC2616]        Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
                 Masinter, L., Leach, P. and T. Berners-Lee,
                 "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616,
                 June 1999.

[RFC2821]        Klensin, J., Ed., "Simple Mail Transfer Protocol",
                 RFC 2821, April 2001.

[RFC2822]        Resnick, P., Ed., "Internet Message Format", RFC
                 2822, April 2001.

[RFC3490]        Faltstrom, P., Hoffman, P. and A. Costello,
                 "Internationalizing Domain Names in Applications
                 (IDNA)", RFC 3490, March 2003.

[RFC3491]        Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
                 Profile for Internationalized Domain Names (IDN)",
                 RFC 3491, March 2003.

[RFC3492]        Costello, A., "Punycode: A Bootstring encoding of
                 Unicode for Internationalized Domain Names in
                 Applications (IDNA)", RFC 3492, March 2003.

[ASCII]          American National Standards Institute (formerly
                 United States of America Standards Institute), "USA
                 Code for Information Interchange", ANSI X3.4-1968.
                 ANSI X3.4-1968 has been replaced by newer versions
                 with slight modifications, but the 1968 version
                 remains definitive for the Internet.

[DomainList]     Internet Assigned Numbers Authority (IANA), Untitled
                 alphabetical list of current top-level domains.
                 http://data.iana.org/TLD/tlds-alpha-by-domain.txt
                 ftp://data.iana.org/TLD/tlds-alpha-by-domain.txt

9.2. Informative References

[ISO.3166.1988]  International Organization for Standardization,
                 "Codes for the representation of names of countries,
                 3rd edition", ISO Standard 3166, August 1988.

[JET]            Konishi, K., et al., "Internationalized Domain Names
                 Registration and Administration Guideline for
                 Chinese, Japanese and Korean", Work in Progress.

[RFC1591]        Postel, J., "Domain Name System Structure and
                 Delegation", RFC 1591, March 1994.

[RegRestr]       Klensin, J., "Registration of Internationalized
                 Domain Names: Overview and Method", Work in Progress,
                 February 2004.

10. Author's Address

John C Klensin
1770 Massachusetts Ave, #322
Cambridge, MA 02140
USA

Phone: +1 617 491 5735
EMail: john-ietf@jck.com

11. Full Copyright Statement

Copyright (C) The Internet Society (2004). This document is subject
to the rights, licenses and restrictions contained in BCP 78 and
except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is currently provided by the
Internet Society.