CodeWars: Not Very Secure
The Problem Statement
In this example you have to validate if a user input string is alphanumeric. The given string is not nil/null/NULL/None, so you don't have to check that. The string has the following conditions to be alphanumeric:
- At least one character ("" is not valid)
- Allowed characters are uppercase / lowercase Latin letters and digits from 0 to 9
- No whitespaces / underscore
The Function Signature:
def alphanumeric(password):
    pass
Testing
First, some imports.
# python
from string import ascii_letters, digits
from typing import Callable
import random
import re
# pypi
from expects import expect, equal
Test Cases
These are the cases given by CodeWars. Most of them are randomly generated.
TEST_CASES = [
"hello world_",
"PassW0rd",
" ",
"__ * __",
"&)))(((",
"43534h56jmTHHF3k",
"",
"nFmRaqZ9BUupkhe99Og4Bn1uLbsBG5",
"b5rN4",
"6sWJaF3Z9A570RWR2WuqyC",
"IAZN0rQMet0CD06Tf(hTO",
"5s76LJ*8Cq5yn",
":mL",
"2rvyam",
"UbMbOsy6LKIUeUGoD9PuyaP3zv",
"wztVBnTzqFEq5u0M7%n",
"""8DxJH4Segq
lqQVlSA10xITOiMt""",
"Ir7zdFO5keT91eA^iWkX90rVqe0X",
"",
"t87xDQOL5NzlQI1KH8M02Qo1axNSz",
"6LzkewodgnnYwfe2Pen/a7ope7owboA",
"XkxGwyr",
"8Xj#GSX906q",
"402QCsvUInL50XPOvcd",
"jOX)EcfCbUVEKdBkcM0",
"",
"jaJeix4OIah39mErxSfLgYX7yI1",
"^HNCy",
"FEXI8JxrMQK2gFVGZnFNI",
"EDcJsDD40=lqSBj3JA",
"FcbS",
"R*ykZ",
"AjtPDT",
"dkPXw31CWSYgiG<EOBoqTX",
"cqVLp9fs98gXl",
"|n37oUuwf5T",
"tho3WBtKubXDwfQZ2Pd",
"8iZc6X4Sk07H2Hcqt31.pwnGy",
"G7zC\etue",
"d",
"V5cNfgK Hj4dEeJ",
"36tNa5ip",
"9ra(zcux",
"N79VJr6L\Dud6jxebOC",
"pk4a0UvRgTDyL",
"rLeZPfihch|HSVmtsYp8IDP0nE",
"sJrK0ME2KlOEWs8OwdUwrtc",
"BZ5YoLxqeSGpn8x1vDJ9AW",
"d:Yy3",
"mLV&I3u3BEiahtE",
"E4WwmoAAeWDD2v4cAH40E2eIoP",
"KeS7PqVNv",
"E953SS2^1BTlDrkVaMRZ2pxji3R",
"7e7w0wn uH7GhK7",
"v91Tlcgo8uOyoXGC5B2",
"rHoMNejW62",
"FvQ;zg4QjtLGincQbgtitRE0x59i",
"atfRWKyYzX#h",
"ryINv[nwOnD6ZA0awajbZG",
"DRzm9wgmqR*tNfTH9ECVuGcp6L",
"a02GE",
"rHLrd8Eg8DtUkelRUR^yLJ0",
"6d520vrKHIhe8G",
"zqnL05D!WOfbnH",
"uIzed|SFyhLpBf72mBKPLy8z",
"MgMg8QzI77zkYMNyojkFEPIW",
"mPe7m5L7t",
"BG AWuGtQCnHac2wb",
"HPG3mi",
"K hTcILzx1tU7",
"5%uG0LrmjW3bSZR",
"lf2S3sIKgFjzxGnnQCkO7YnARv",
"PwCVGoB",
"6lU]O",
"Jze74",
"DVf/usVMS62kn",
"DXT5ASHQYaVYc9uZ3Jm3ZAw",
"wsOFSPcV,rgba9A4baNNcC",
"uZuRbdTv(N9xjXdNDUye",
"0LSZSmekL0",
"YrAcW1CDSjRwa",
"z9aG3Q9w5R*bt",
"fsBg8NnhhEkhTPz9XsgAyxo0a",
"wfT!o6uYr",
"LvON7",
"YrgexSunN3DMDBtZ9MoXrnhyBonl",
"KvnruYLI0TnXa7rnT",
"XlhrT95PiBjzpm7xeuQXk",
"c00UB1",
"W51WK7BzeF",
"v0r k",
"4J]oT0qdNx",
">uLYV2YLRqSSvRE",
"z0cP>O6W0xtEyZ2",
"7Td7ZJpsMe6O7dmeA5XgQj",
"zjQHfoqXfIGRsow",
"qv",
"2JnojjTEWNwOJ9)LzL2AY",
'WEEZuO1"3ggmYRs8Sp2wUqdQXz0',
"LF9NZbe7AYzFIF5IO",
"V9SByoOlc0yUNqdV0",
"pNn",
"q2AUD5KJZ3bKMEDlqgrrLhzX6PtQ",
"`cCTmmb0HOXR",
"flQyxMWDidg",
"8X'51m9UD",
"VYDcIkrLDFA5cDz8mHGp;6x6RqU"]
The Tester
It turns out that, once again, as with the binary conversion case from edabit, there is a python built-in that solves this problem (isalnum). From the python documentation:
Return True if all characters in the string are alphanumeric and there is at least one character, False otherwise. A character c is alphanumeric if one of the following returns True: c.isalpha(), c.isdecimal(), c.isdigit(), or c.isnumeric().
Curiously, there is an isdecimal method and an isdigit method. I read the documentation for them and it appears that isdigit actually encompasses more than the 10 digits of the base-10 system, including something called the Kharosthi numbers, so this function is too permissive for the problem as stated. The test cases they gave don't seem to have any exotic characters, though, so I'm going to assume that it will work as a validator for this problem.
def tester(testee: Callable[[str], bool], cases: list=TEST_CASES,
           throw: bool=True) -> None:
    """Run the testee over the test-cases

    Args:
     - `testee`: function to check if a string is alphanumeric
     - `cases`: iterable test-cases to check
     - `throw`: Throw an exception if a case fails (otherwise just print failure)

    Raises:
     AssertionError if any case fails and `throw` is True
    """
    for test_case in cases:
        try:
            # str.isalnum is the reference answer for each case
            expect(test_case.isalnum()).to(equal(testee(test_case)))
        except AssertionError as error:
            print("Failed Test Case: '{0}' Expected: {1} Actual: {2}".format(
                test_case, test_case.isalnum(), testee(test_case)))
            if throw:
                raise
    return
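The permissiveness of isalnum mentioned above can be seen in a small sketch. The sample characters are my own picks for illustration, not part of the CodeWars tests, and the ASCII-only pattern is just one way to express the kata's rule.

```python
import re

# Hypothetical samples (not from the CodeWars tests): each one passes
# str.isalnum() but falls outside the kata's a-z, A-Z, 0-9 ranges.
SAMPLES = {
    "\u00e9": "Latin small letter e with acute",
    "\u00b2": "superscript two",
    "\u00bd": "vulgar fraction one half",
    "\u0663": "Arabic-Indic digit three",
}

ASCII_ONLY = re.compile("[a-zA-Z0-9]+")

for sample, description in SAMPLES.items():
    # prints: description, what isalnum says, what the ASCII pattern says
    print(description,
          sample.isalnum(),
          ASCII_ONLY.fullmatch(sample) is not None)
```

So isalnum only works as the reference answer here because the test cases stick to ASCII.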
A Solution
The problem page on CodeWars shows a "RegularExpressions" tag, so I'm going to assume that that's the way to solve it. My first thought was to use the \w special character, but the documentation says:
Matches Unicode word characters; this includes alphanumeric characters (as defined by str.isalnum()) as well as the underscore (_). If the ASCII flag is used, only [a-zA-Z0-9_] is matched.
The description says that with the ASCII flag it's equivalent to [a-zA-Z0-9_], so we can't use it (we don't want underscores), but if we use the same character ranges and leave out the underscore, it should work.
ALPHANUMERIC = "[a-zA-Z0-9]"
ONE_OR_MORE = "+"
PATTERN_ONE = re.compile(ALPHANUMERIC + ONE_OR_MORE)

def submitted(password: str) -> bool:
    """Check if the input is alphanumeric

    Args:
     - password: string to check

    Returns:
     True if alphanumeric False otherwise
    """
    return PATTERN_ONE.fullmatch(password) is not None

tester(submitted)
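The documentation's point about the ASCII flag can be checked with a quick sketch. The sample strings are hypothetical ones chosen for illustration.

```python
import re

UNICODE_WORD = re.compile(r"\w+")          # default behavior for str patterns
ASCII_WORD = re.compile(r"\w+", re.ASCII)  # \w narrows to [a-zA-Z0-9_]

# "caf\u00e9" ends in an accented e, a hypothetical non-ASCII sample
print(UNICODE_WORD.fullmatch("caf\u00e9") is not None)  # True
print(ASCII_WORD.fullmatch("caf\u00e9") is not None)    # False

# Even with the ASCII flag the underscore still gets through
print(ASCII_WORD.fullmatch("snake_case") is not None)   # True
```

Either way \w lets the underscore in, which is why the explicit character ranges are used instead.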
As a check, I'll see what happens if I use \w instead.
# raw string so that "\w" isn't treated as an (invalid) string escape
WITH_UNDERSCORES = re.compile(r"\w" + ONE_OR_MORE)

def allows_underscores(password: str) -> bool:
    """Checks if the password has only alphanumeric or underscore characters

    Args:
     - password: input to check

    Returns:
     - True if valid
    """
    return WITH_UNDERSCORES.fullmatch(password) is not None

JUST_UNDERSCORES = "____"
print(JUST_UNDERSCORES.isalnum())
print(allows_underscores(JUST_UNDERSCORES))
tester(allows_underscores)
False
True
So, that was a surprise. Why did allows_underscores pass the tests? If you look back at the test cases you'll see that none of them have just letters, digits, and underscores; if there's an underscore then there's also whitespace or some other invalid character. Seems like there's a hole in their testing.
Let's add in a couple of cases that should fail.
EXTRA = "".join(random.choices(ascii_letters + digits + "_", k=25)) + "_"
TEST_CASES_2 = TEST_CASES + [EXTRA, JUST_UNDERSCORES]
tester(allows_underscores, TEST_CASES_2, throw=False)
That's better.
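To see exactly which cases trip up a candidate without depending on the random string or on expects, here is a minimal sketch. The disagreements helper, the two candidate functions, and the fixed cases are all hypothetical stand-ins for the code above, not part of the original tester.

```python
import re
from typing import Callable, Iterable, List

def disagreements(testee: Callable[[str], bool],
                  cases: Iterable[str]) -> List[str]:
    """Hypothetical helper: the cases where testee differs from str.isalnum"""
    return [case for case in cases if testee(case) != case.isalnum()]

WORD = re.compile(r"\w+")
STRICT = re.compile("[a-zA-Z0-9]+")

def with_underscores(password: str) -> bool:
    return WORD.fullmatch(password) is not None

def strict(password: str) -> bool:
    return STRICT.fullmatch(password) is not None

# Fixed stand-ins for the random case used in the post
CASES = ["abc_123", "____", "PassW0rd", "hello world_", ""]

print(disagreements(with_underscores, CASES))  # ['abc_123', '____']
print(disagreements(strict, CASES))            # []
```

Only the cases made up of nothing but letters, digits, and underscores expose the difference, which matches the hole in the original test cases.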
The End
So, another not-so-exciting problem, but I did learn that there's a fullmatch function now, which spurred me to look up what the match and search methods do again, which was useful. As a note to my future self, match and fullmatch are essentially shortcuts so you don't have to use the beginning and ending of line characters. That is to say, search("^[a-z]+") is the equivalent of match("[a-z]+") and search("^[a-z]+$") is the equivalent of fullmatch("[a-z]+"). There might be other differences, but for simple cases like this that'll do.
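A short sketch of those equivalences, along with one difference that I know of: without the MULTILINE flag, "$" also matches just before a newline at the very end of the string, while fullmatch has to consume everything. The sample strings are made up for illustration.

```python
import re

LOWERCASE = "[a-z]+"

# match anchors at the start only, like search with "^"
print(re.match(LOWERCASE, "abc123") is not None)         # True
print(re.search("^" + LOWERCASE, "abc123") is not None)  # True

# fullmatch has to consume the whole string
print(re.fullmatch(LOWERCASE, "abc123") is not None)     # False

# "$" tolerates a single trailing newline, fullmatch does not
print(re.search("^" + LOWERCASE + "$", "abc\n") is not None)  # True
print(re.fullmatch(LOWERCASE, "abc\n") is not None)           # False
```

For a validator like this one, that makes fullmatch the safer choice, since a password ending in a newline would slip past the "^...$" version.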