net/cert/internal/verify_name_match.cc - Issue 1125333005: RFC 2459 name comparison.

Side by Side Diff: net/cert/internal/verify_name_match.cc

Issue 1125333005: RFC 2459 name comparison. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: fix win build Created 5 years, 5 months ago


OLD	NEW
1 // Copyright 2015 The Chromium Authors. All rights reserved.	1 // Copyright 2015 The Chromium Authors. All rights reserved.

2 // Use of this source code is governed by a BSD-style license that can be	2 // Use of this source code is governed by a BSD-style license that can be

3 // found in the LICENSE file.	3 // found in the LICENSE file.

4	4

5 #include "net/cert/internal/verify_name_match.h"	5 #include "net/cert/internal/verify_name_match.h"

	6

	7 #include <string.h>

	8

	9 #include "base/stl_util.h"

	10 #include "base/strings/string16.h"

	11 #include "base/strings/string_util.h"

	12 #include "base/strings/utf_string_conversion_utils.h"

	13 #include "base/strings/utf_string_conversions.h"

	14 #include "base/sys_byteorder.h"

	15 #include "base/third_party/icu/icu_utf.h"

	16 #include "base/tuple.h"

6 #include "net/der/input.h"	17 #include "net/der/input.h"

	18 #include "net/der/parser.h"

	19 #include "net/der/tag.h"

7	20

8 namespace net {	21 namespace net {

9	22

	23 namespace {

	24

	25 // Types of character set checking that NormalizeDirectoryString can perform.

	26 enum CharsetEnforcement {

	27 NO_ENFORCEMENT,

	28 ENFORCE_PRINTABLE_STRING,

	29 ENFORCE_ASCII,

	30 };

	31

	32 // Normalizes |output|, a UTF-8 encoded string, as if it contained

	33 // only ASCII characters.

	34 //

	35 // This could be considered a partial subset of RFC 5280 rules, and

	36 // is compatible with RFC 2459/3280.

	37 //

	38 // In particular, RFC 5280, Section 7.1 describes how UTF8String

	39 // and PrintableString should be compared - using the LDAP StringPrep

	40 // profile of RFC 4518, with case folding and whitespace compression.

	41 // However, because it is optional for implementations and because

	42 // it's desirable to avoid the size cost of the StringPrep tables,

	43 // this function treats |output| as if it was composed of ASCII.

	44 //

	45 // That is, rather than folding all whitespace characters, it only

	46 // folds ' '. Rather than case folding using locale-aware handling,

	47 // it only folds A-Z to a-z.

	48 //

	49 // This gives better results than outright rejecting (due to mismatched

	50 // encodings), or from doing a strict binary comparison (the minimum

	51 // required by RFC 3280), and is sufficient for those certificates

	52 // publicly deployed.

	53 //

	54 // If |charset_enforcement| is not NO_ENFORCEMENT and |output| contains any

	55 // characters not allowed in the specified charset, returns false.

	56 //

	57 // NOTE: |output| will be modified regardless of the return.

	58 WARN_UNUSED_RESULT bool NormalizeDirectoryString(

	59 CharsetEnforcement charset_enforcement,

	60 std::string* output) {

	61 // Normalized version will always be equal or shorter than input.

	62 // Normalize in place and then truncate the output if necessary.

	63 std::string::const_iterator read_iter = output->begin();

	64 std::string::iterator write_iter = output->begin();

	65

	66 for (; read_iter != output->end() && *read_iter == ' '; ++read_iter) {

	67 // Ignore leading whitespace.

	68 }

	69

	70 for (; read_iter != output->end(); ++read_iter) {

	71 const unsigned char c = *read_iter;

	72 if (c == ' ') {

	73 // If there are non-whitespace characters remaining in input, compress

	74 // multiple whitespace chars to a single space, otherwise ignore trailing

	75 // whitespace.

	76 std::string::const_iterator next_iter = read_iter + 1;

	77 if (next_iter != output->end() && *next_iter != ' ')

	78 *(write_iter++) = ' ';

	79 } else if (c >= 'A' && c <= 'Z') {

	80 // Fold case.

	81 *(write_iter++) = c + ('a' - 'A');

	82 } else {

	83 // Note that these checks depend on the characters allowed by earlier

	84 // conditions also being valid for the enforced charset.

	85 switch (charset_enforcement) {

	86 case ENFORCE_PRINTABLE_STRING:

	87 if (!((c >= 'a' && c <= 'z') || (c >= '\'' && c <= ':') || c == '=' ||

	88 c == '?'))

	89 return false;

	90 break;

	91 case ENFORCE_ASCII:

	92 if (c > 0x7F)

	93 return false;

	94 break;

	95 case NO_ENFORCEMENT:

	96 break;

	97 }

	98 *(write_iter++) = c;

	99 }

	100 }

	101 if (write_iter != output->end())

	102 output->erase(write_iter, output->end());

	103 return true;

	104 }
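
The normalization that NormalizeDirectoryString performs can be illustrated outside Chromium. The following is a minimal sketch, assuming plain `std::string` input and no charset enforcement; `NormalizeAsciiSketch` is a hypothetical name, not part of the patch, and it builds a new string rather than rewriting in place as the patch does.

```cpp
#include <cassert>
#include <string>

// Hypothetical standalone helper (not part of the patch). Drops leading and
// trailing ' ', collapses runs of ' ' into one, and folds A-Z to a-z.
// All other bytes, including non-ASCII UTF-8 bytes, are copied unchanged.
std::string NormalizeAsciiSketch(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  bool pending_space = false;
  for (unsigned char c : in) {
    if (c == ' ') {
      // Only remember the space if something was already written; this is
      // what drops leading whitespace.
      pending_space = !out.empty();
      continue;
    }
    if (pending_space) {
      out.push_back(' ');
      pending_space = false;
    }
    if (c >= 'A' && c <= 'Z')
      c = static_cast<unsigned char>(c + ('a' - 'A'));
    out.push_back(static_cast<char>(c));
  }
  // A pending space at the end is never emitted, which drops trailing
  // whitespace. The result can never be longer than the input.
  assert(out.size() <= in.size());
  return out;
}
```

With this sketch, an input such as `"  Foo   Bar "` normalizes to `"foo bar"`, and an input consisting only of spaces normalizes to the empty string.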

	105

	106 // Normalizes the DER-encoded PrintableString value |in| according to

	107 // RFC 2459, Section 4.1.2.4

	108 //

	109 // Briefly, normalization involves removing leading and trailing

	110 // whitespace, folding multiple whitespace characters into a single

	111 // whitespace character, and normalizing on case (this function

	112 // normalizes to lowercase).

	113 //

	114 // During normalization, this function also validates that |in|

	115 // is properly encoded - that is, that it restricts to the character

	116 // set defined in X.680 (2008), Section 41.4, Table 10. X.680 defines

	117 // the valid characters as

	118 // a-z A-Z 0-9 (space) ' ( ) + , - . / : = ?

	119 //

	120 // However, due to an old OpenSSL encoding bug, a number of

	121 // certificates have also included '*', which has historically been

	122 // allowed by implementations, and so is also allowed here.

	123 //

	124 // If |in| can be normalized, returns true and sets |output| to the

	125 // case folded, normalized value. If |in| is invalid, returns false.

	126 // NOTE: |output| will be modified regardless of the return.

	127 WARN_UNUSED_RESULT bool NormalizePrintableStringValue(const der::Input& in,

	128 std::string* output) {

	129 in.CopyToString(output);

	130 return NormalizeDirectoryString(ENFORCE_PRINTABLE_STRING, output);

	131 }
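
The ENFORCE_PRINTABLE_STRING branch relies on the fact that `'` through `:` is a contiguous ASCII range covering `' ( ) * + , - . /`, the digits, and `:`. As a rough illustration, the same character set can be spelled out explicitly. `IsPrintableStringCharSketch` is a hypothetical helper, not part of the patch; it accepts upper-case letters and space directly, whereas the patch handles those in earlier branches before the charset check.

```cpp
#include <cassert>

// Hypothetical predicate (not part of the patch): true for the X.680
// PrintableString characters plus '*', which the patch tolerates because of
// the OpenSSL encoding bug mentioned in its comments.
bool IsPrintableStringCharSketch(unsigned char c) {
  if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
      (c >= '0' && c <= '9')) {
    return true;
  }
  switch (c) {
    case ' ':
    case '\'':
    case '(':
    case ')':
    case '*':  // Not in X.680; allowed for compatibility.
    case '+':
    case ',':
    case '-':
    case '.':
    case '/':
    case ':':
    case '=':
    case '?':
      return true;
    default:
      return false;
  }
}
```

Characters such as `@`, `&`, and `_` are rejected by this predicate, as is any byte above 0x7F.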

	132

	133 // Normalizes a UTF8String value. See the comment for NormalizeDirectoryString

	134 // for details.

	135 //

	136 // If |in| can be normalized, returns true and sets |output| to the

	137 // case folded, normalized value. If |in| is invalid, returns false.

	138 // NOTE: |output| will be modified regardless of the return.

	139 WARN_UNUSED_RESULT bool NormalizeUtf8StringValue(const der::Input& in,

	140 std::string* output) {

	141 in.CopyToString(output);

	142 return NormalizeDirectoryString(NO_ENFORCEMENT, output);

	143 }

	144

	145 // IA5String is ISO/IEC Registrations 1 and 6 from the ISO

	146 // "International Register of Coded Character Sets to be used

	147 // with Escape Sequences", plus space and delete. That's just the

	148 // polite way of saying 0x00 - 0x7F, aka ASCII (or, more formally,

	149 // ISO/IEC 646)

	150 //

	151 // If |in| can be normalized, returns true and sets |output| to the case folded,

	152 // normalized value. If |in| is invalid, returns false.

	153 // NOTE: |output| will be modified regardless of the return.

	154 WARN_UNUSED_RESULT bool NormalizeIA5StringValue(const der::Input& in,

	155 std::string* output) {

	156 in.CopyToString(output);

	157 return NormalizeDirectoryString(ENFORCE_ASCII, output);

	158 }

	159

	160 // Converts BMPString value to UTF-8 and then normalizes it. See the comment for

	161 // NormalizeDirectoryString for details.

	162 //

	163 // If |in| can be normalized, returns true and sets |output| to the case folded,

	164 // normalized value. If |in| is invalid, returns false.

	165 // NOTE: |output| will be modified regardless of the return.

	166 WARN_UNUSED_RESULT bool NormalizeBmpStringValue(const der::Input& in,

	167 std::string* output) {

	168 if (in.Length() % 2 != 0)

	169 return false;

	170

	171 base::string16 in_16bit;

	172 memcpy(base::WriteInto(&in_16bit, in.Length() / 2 + 1), in.UnsafeData(),

	173 in.Length());

	174 for (base::char16& c : in_16bit) {

	175 // BMPString is UCS-2 in big-endian order.

	176 c = base::NetToHost16(c);

	177

	178 // BMPString only supports codepoints in the Basic Multilingual Plane;

	179 // surrogates are not allowed.

	180 if (CBU_IS_SURROGATE(c))

	181 return false;

	182 }

	183 if (!base::UTF16ToUTF8(in_16bit.data(), in_16bit.size(), output))

	184 return false;

	185 return NormalizeDirectoryString(NO_ENFORCEMENT, output);

	186 }

	187

	188 // Converts UniversalString value to UTF-8 and then normalizes it. See the

	189 // comment for NormalizeDirectoryString for details.

	190 //

	191 // If |in| can be normalized, returns true and sets |output| to the case folded,

	192 // normalized value. If |in| is invalid, returns false.

	193 // NOTE: |output| will be modified regardless of the return.

	194 WARN_UNUSED_RESULT bool NormalizeUniversalStringValue(const der::Input& in,

	195 std::string* output) {

	196 if (in.Length() % 4 != 0)

	197 return false;

	198

	199 std::vector<uint32_t> in_32bit(in.Length() / 4);

	200 memcpy(vector_as_array(&in_32bit), in.UnsafeData(), in.Length());
	Ryan Sleevi 2015/07/17 21:28:59: What if in.Length() == 0? I can't remember what ISO C guarantees for 0 lengths.
	mattm 2015/07/21 00:17:47: Oh, yeah. Zero lengths are okay, but the pointers still need to be valid. I've added a check.
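
One way to sidestep the zero-length memcpy question entirely is to assemble each code point from its four bytes, so that an empty input never dereferences or copies from a pointer. The following is a minimal sketch under that assumption, not the check that was added to the patch. `DecodeUcs4BigEndianSketch` is a hypothetical name, and its validity test (range and surrogates only) is a simplification that may be looser than the macro the patch uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the patch). Decodes big-endian UCS-4 bytes
// into code points. Returns false if |length| is not a multiple of 4 or if a
// value is a surrogate or lies above U+10FFFF.
bool DecodeUcs4BigEndianSketch(const uint8_t* data,
                               size_t length,
                               std::vector<uint32_t>* out) {
  assert(out != nullptr);
  out->clear();
  if (length % 4 != 0)
    return false;
  // When |length| is 0 the loop body never runs, so |data| is never read.
  for (size_t i = 0; i < length; i += 4) {
    const uint32_t codepoint = (static_cast<uint32_t>(data[i]) << 24) |
                               (static_cast<uint32_t>(data[i + 1]) << 16) |
                               (static_cast<uint32_t>(data[i + 2]) << 8) |
                               static_cast<uint32_t>(data[i + 3]);
    if (codepoint > 0x10FFFF || (codepoint >= 0xD800 && codepoint <= 0xDFFF))
      return false;
    out->push_back(codepoint);
  }
  return true;
}
```

Reading byte by byte also removes the need for a separate byte-order conversion step, at the cost of not reusing the existing `base::NetToHost32` helper.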
	201 for (const uint32_t c : in_32bit) {

	202 // UniversalString is UCS-4 in big-endian order.

	203 uint32_t codepoint = base::NetToHost32(c);

	204 if (!CBU_IS_UNICODE_CHAR(codepoint))

	205 return false;

	206

	207 base::WriteUnicodeCharacter(codepoint, output);

	208 }

	209 return NormalizeDirectoryString(NO_ENFORCEMENT, output);

	210 }

	211

	212 // Converts the string |value| to UTF-8, normalizes it, and stores in |output|.

	213 // |tag| must be one of the types for which IsNormalizableDirectoryString is true.

	214 //

	215 // If |value| can be normalized, returns true and sets |output| to the case

	216 // folded, normalized value. If |value| is invalid, returns false.

	217 // NOTE: |output| will be modified regardless of the return.

	218 WARN_UNUSED_RESULT bool NormalizeValue(const der::Tag tag,

	219 const der::Input& value,

	220 std::string* output) {

	221 switch (tag) {

	222 case der::kPrintableString:

	223 return NormalizePrintableStringValue(value, output);

	224 case der::kUtf8String:

	225 return NormalizeUtf8StringValue(value, output);

	226 case der::kIA5String:

	227 return NormalizeIA5StringValue(value, output);

	228 case der::kUniversalString:

	229 return NormalizeUniversalStringValue(value, output);

	230 case der::kBmpString:

	231 return NormalizeBmpStringValue(value, output);

	232 default:

	233 NOTREACHED();

	234 return false;

	235 }

	236 }

	237

	238 // Returns true if |tag| is a string type that NormalizeValue can handle.

	239 bool IsNormalizableDirectoryString(der::Tag tag) {

	240 switch (tag) {

	241 case der::kPrintableString:

	242 case der::kUtf8String:

	243 // RFC 5280 only requires handling IA5String for comparing domainComponent

	244 // values, but handling it here avoids the need to special case anything.

	245 case der::kIA5String:

	246 case der::kUniversalString:

	247 case der::kBmpString:

	248 return true;

	249 // TeletexString isn't normalized. Section 8 of RFC 5280 briefly

	250 // describes the historical confusion between treating TeletexString

	251 // as Latin1String vs T.61, and there are even incompatibilities within

	252 // T.61 implementations. As this type is virtually unused, simply

	253 // treat it with a binary comparison, as permitted by RFC 3280/5280.

	254 default:

	255 return false;

	256 }

	257 }

	258

	259 bool VerifyValueMatch(const der::Tag a_tag,

	260 const der::Input& a_value,

	261 const der::Tag b_tag,

	262 const der::Input& b_value) {

	263 if (IsNormalizableDirectoryString(a_tag) &&

	264 IsNormalizableDirectoryString(b_tag)) {

	265 std::string a_normalized, b_normalized;

	266 if (!NormalizeValue(a_tag, a_value, &a_normalized) ||

	267 !NormalizeValue(b_tag, b_value, &b_normalized))

	268 return false;

	269 return a_normalized == b_normalized;

	270 }

	271 // Attributes encoded with different types may be assumed to be unequal.

	272 if (a_tag != b_tag)

	273 return false;

	274 // All other types use binary comparison.

	275 return a_value.Equals(b_value);

	276 }

	277

	278 // Vector of Tuple<Attribute Type, Attribute Value tag, Attribute Value>.

	279 using RdnVector = std::vector<base::Tuple<der::Input, der::Tag, der::Input>>;
	Ryan Sleevi 2015/07/17 21:28:59: Should we just treat this as a struct with named members? It seems like that'd be totally legal here. In particular, I found lines 327-328 fairly hard to read to understand whether or not there's typos or anything in place (note only 4 meaningful characters different in those two lines, and it required care to make sure numbers weren't typo'd)
	mattm 2015/07/21 00:17:47: Done. Also changed the loop variables to just "a" and "b" since having foo_type_and_value repeated so many times just made things hard to parse.
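
The readability point can be illustrated with a struct that has named members and the same naive set comparison. This is a minimal sketch, not the code from the follow-up patch set: `AttributeTypeAndValueSketch` and `RdnSetsMatchSketch` are hypothetical names, `std::string` and `int` stand in for `der::Input` and `der::Tag`, and exact equality stands in for the normalized comparison done by VerifyValueMatch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one AttributeTypeAndValue (not the patch's type).
struct AttributeTypeAndValueSketch {
  std::string type;   // Stand-in for the der::Input holding the OID.
  int value_tag;      // Stand-in for der::Tag.
  std::string value;  // Stand-in for the der::Input holding the value.
};

// Returns true if both sets are non-empty, have the same size, and every
// element of |a_set| has an equal element in |b_set|, in any order.
bool RdnSetsMatchSketch(const std::vector<AttributeTypeAndValueSketch>& a_set,
                        const std::vector<AttributeTypeAndValueSketch>& b_set) {
  if (a_set.empty() || a_set.size() != b_set.size())
    return false;
  for (const auto& a : a_set) {
    const bool matched =
        std::any_of(b_set.begin(), b_set.end(), [&a](const auto& b) {
          // Named members make it easy to see that like is compared with like.
          return a.type == b.type && a.value_tag == b.value_tag &&
                 a.value == b.value;
        });
    if (!matched)
      return false;
  }
  return true;
}
```

With named members, a transposed field in the comparison is visible at a glance, which is harder to guarantee with positional `base::get<N>` calls.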
	280

	281 WARN_UNUSED_RESULT bool ReadRdn(der::Parser* parser, RdnVector* out) {
	Ryan Sleevi 2015/07/17 21:29:00: Document
	mattm 2015/07/21 00:17:47: Done.
	282 while (parser->HasMore()) {

	283 der::Parser attr_type_and_value;

	284 if (!parser->ReadSequence(&attr_type_and_value))

	285 return false;

	286 // Read the attribute type, which must be OBJECT IDENTIFIERs.
	Ryan Sleevi 2015/07/17 21:29:00: Why the plural?
	mattm 2015/07/21 00:17:47: fixed
	287 der::Input type;

	288 if (!attr_type_and_value.ReadTag(der::kOid, &type))

	289 return false;

	290

	291 // Read the attribute value.

	292 der::Tag tag;

	293 der::Input value;

	294 if (!attr_type_and_value.ReadTagAndValue(&tag, &value))

	295 return false;

	296

	297 // There should be no more elements in the sequence after reading the

	298 // attribute type and value.

	299 if (attr_type_and_value.HasMore())

	300 return false;

	301

	302 out->push_back(base::MakeTuple(type, tag, value));

	303 }

	304 return true;

	305 }

	306

	307 // Verifies that |a| and |b| are the same length and that every

	308 // AttributeTypeAndValue in |a| has a matching AttributeTypeAndValue in |b|.

	309 bool VerifyRdnMatch(der::Parser* a, der::Parser* b) {

	310 RdnVector a_type_and_values, b_type_and_values;

	311 if (!ReadRdn(a, &a_type_and_values) || !ReadRdn(b, &b_type_and_values))

	312 return false;

	313

	314 if (a_type_and_values.empty() || b_type_and_values.empty() ||

	315 a_type_and_values.size() != b_type_and_values.size())

	316 return false;

	317

	318 // The ordering of elements may differ due to denormalized values sorting

	319 // differently in the DER encoding. Since the number of elements should be

	320 // small, a naive linear search for each element should be fine.
	Ryan Sleevi 2015/07/17 21:28:59: Since it tripped up Eric, maybe it's worth expanding a comment here about the 'hostile' case and why it's a non-issue (I still think it's a non issue)
	mattm 2015/07/21 00:17:47: Done.
	321 for (const auto& a_type_and_value : a_type_and_values) {

	322 bool matched = false;

	323 for (const auto& b_type_and_value : b_type_and_values) {

	324 if (base::get<0>(a_type_and_value)

	325 .Equals(base::get<0>(b_type_and_value)) &&

	326 VerifyValueMatch(

	327 base::get<1>(a_type_and_value), base::get<2>(a_type_and_value),

	328 base::get<1>(b_type_and_value), base::get<2>(b_type_and_value))) {

	329 matched = true;

	330 break;

	331 }

	332 }

	333 if (!matched)

	334 return false;

	335 }

	336

	337 // Every element in |a_type_and_values| had a matching element in

	338 // |b_type_and_values|.

	339 return true;

	340 }

	341

	342 } // namespace

	343

10 bool VerifyNameMatch(const der::Input& a, const der::Input& b) {	344 bool VerifyNameMatch(const der::Input& a, const der::Input& b) {

11 // TODO(mattm): use normalization as specified in RFC 5280 section 7.	345 der::Parser a_parser(a);

12 return a.Equals(b);	346 der::Parser b_parser(b);

	347 der::Parser a_rdn_sequence;

	348 der::Parser b_rdn_sequence;

	349

	350 if (!a_parser.ReadSequence(&a_rdn_sequence) ||

	351 !b_parser.ReadSequence(&b_rdn_sequence)) {

	352 return false;

	353 }

	354

	355 // No data should remain in the inputs after the RDN sequence.

	356 if (a_parser.HasMore() || b_parser.HasMore())

	357 return false;

	358

	359 // Must have at least one RDN.
	Ryan Sleevi 2015/07/17 21:29:00: I think for this, we do want to substantially expand on the comments about the data structures in play here. For example, see Eric's comment style in https://codereview.chromium.org/1218753002/diff/470001/net/cert/internal/sign... , which more thoroughly details things.
	I guess here, this is a tricky thing because of Section 4.1.2.6 of RFC 5280, compared with Section 4.1.2.4. That is, when matching a Subject to an Issuer name, this requirement holds - it MUST be a non-empty sequence. However, when matching a Subject to a Subject, if the cert itself is not a CA cert, it's perfectly valid to have an empty DN and a critical SAN (and in fact, at least one major wireless vendor does exactly this for their captive portal)
	So I don't think we want to do this check here in the name matching, and it may need to be surfaced up to when we do Subject/Issuer comparisons, versus name-equality comparisons.
	mattm 2015/07/21 00:17:47: Re: expanding the comments about the data structures: done. Re: doing the non-empty check in the name matching: removed.
	360 if (!a_rdn_sequence.HasMore() || !b_rdn_sequence.HasMore())

	361 return false;

	362

	363 while (a_rdn_sequence.HasMore() && b_rdn_sequence.HasMore()) {

	364 der::Parser a_rdn, b_rdn;

	365 if (!a_rdn_sequence.ReadConstructed(der::kSet, &a_rdn) ||

	366 !b_rdn_sequence.ReadConstructed(der::kSet, &b_rdn)) {

	367 return false;

	368 }

	369 if (!VerifyRdnMatch(&a_rdn, &b_rdn))

	370 return false;

	371 }

	372

	373 // If one of the sequences has more elements than the other, not a match.

	374 if (a_rdn_sequence.HasMore() || b_rdn_sequence.HasMore())

	375 return false;

	376

	377 return true;

13 }	378 }

14	379

15 } // namespace net	380 } // namespace net
