CodeWars: Not Very Secure

The Problem Statement

In this example you have to validate if a user input string is alphanumeric. The given string is not nil/null/NULL/None, so you don't have to check that.

The string has the following conditions to be alphanumeric:

  • At least one character ("" is not valid)
  • Allowed characters are uppercase / lowercase Latin letters and digits from 0 to 9
  • No whitespaces / underscore

The Function Signature:

def alphanumeric(password):
    pass

Testing

First, some imports.

# python
from string import ascii_letters, digits
from typing import Callable

import random
import re

# pypy
from expects import expect, equal

Test Cases

These are the cases give by CodeWars - most of them are randomly generated.

TEST_CASES = [
    "hello world_",
    "PassW0rd",
    "     ",
    "__ * __",
    "&)))(((",
    "43534h56jmTHHF3k",
    "",
    "nFmRaqZ9BUupkhe99Og4Bn1uLbsBG5",
    "b5rN4",
    "6sWJaF3Z9A570RWR2WuqyC",
    "IAZN0rQMet0CD06Tf(hTO",
    "5s76LJ*8Cq5yn",
    ":mL",
    "2rvyam",
    "UbMbOsy6LKIUeUGoD9PuyaP3zv",
    "wztVBnTzqFEq5u0M7%n",
    """8DxJH4Segq
    lqQVlSA10xITOiMt""",
    "Ir7zdFO5keT91eA^iWkX90rVqe0X",
    "",
    "t87xDQOL5NzlQI1KH8M02Qo1axNSz",
    "6LzkewodgnnYwfe2Pen/a7ope7owboA",
    "XkxGwyr",
    "8Xj#GSX906q",
    "402QCsvUInL50XPOvcd",
    "jOX)EcfCbUVEKdBkcM0",
    "",
    "jaJeix4OIah39mErxSfLgYX7yI1",
    "^HNCy",
    "FEXI8JxrMQK2gFVGZnFNI",
    "EDcJsDD40=lqSBj3JA",
    "FcbS",
    "R*ykZ",
    "AjtPDT",
    "dkPXw31CWSYgiG<EOBoqTX",
    "cqVLp9fs98gXl",
    "|n37oUuwf5T",
    "tho3WBtKubXDwfQZ2Pd",
    "8iZc6X4Sk07H2Hcqt31.pwnGy",
    "G7zC\etue",
    "d",
    "V5cNfgK Hj4dEeJ",
    "36tNa5ip",
    "9ra(zcux",
    "N79VJr6L\Dud6jxebOC",
    "pk4a0UvRgTDyL",
    "rLeZPfihch|HSVmtsYp8IDP0nE",
    "sJrK0ME2KlOEWs8OwdUwrtc",
    "BZ5YoLxqeSGpn8x1vDJ9AW",
    "d:Yy3",
    "mLV&I3u3BEiahtE",
    "E4WwmoAAeWDD2v4cAH40E2eIoP",
    "KeS7PqVNv",
    "E953SS2^1BTlDrkVaMRZ2pxji3R",
    "7e7w0wn uH7GhK7",
    "v91Tlcgo8uOyoXGC5B2",
    "rHoMNejW62",
    "FvQ;zg4QjtLGincQbgtitRE0x59i",
    "atfRWKyYzX#h",
    "ryINv[nwOnD6ZA0awajbZG",
    "DRzm9wgmqR*tNfTH9ECVuGcp6L",
    "a02GE",
    "rHLrd8Eg8DtUkelRUR^yLJ0",
    "6d520vrKHIhe8G",
    "zqnL05D!WOfbnH",
    "uIzed|SFyhLpBf72mBKPLy8z",
    "MgMg8QzI77zkYMNyojkFEPIW",
    "mPe7m5L7t",
    "BG        AWuGtQCnHac2wb",
    "HPG3mi",
    "K hTcILzx1tU7",
    "5%uG0LrmjW3bSZR",
    "lf2S3sIKgFjzxGnnQCkO7YnARv",
    "PwCVGoB",
    "6lU]O",
    "Jze74",
    "DVf/usVMS62kn",
    "DXT5ASHQYaVYc9uZ3Jm3ZAw",
    "wsOFSPcV,rgba9A4baNNcC",
    "uZuRbdTv(N9xjXdNDUye",
    "0LSZSmekL0",
    "YrAcW1CDSjRwa",
    "z9aG3Q9w5R*bt",
    "fsBg8NnhhEkhTPz9XsgAyxo0a",
    "wfT!o6uYr",
    "LvON7",
    "YrgexSunN3DMDBtZ9MoXrnhyBonl",
    "KvnruYLI0TnXa7rnT",
    "XlhrT95PiBjzpm7xeuQXk",
    "c00UB1",
    "W51WK7BzeF",
    "v0r k",
    "4J]oT0qdNx",
    ">uLYV2YLRqSSvRE",
    "z0cP>O6W0xtEyZ2",
    "7Td7ZJpsMe6O7dmeA5XgQj",
    "zjQHfoqXfIGRsow",
    "qv",
    "2JnojjTEWNwOJ9)LzL2AY",
    'WEEZuO1"3ggmYRs8Sp2wUqdQXz0',
    "LF9NZbe7AYzFIF5IO",
    "V9SByoOlc0yUNqdV0",
    "pNn",
    "q2AUD5KJZ3bKMEDlqgrrLhzX6PtQ",
    "`cCTmmb0HOXR",
    "flQyxMWDidg",
    "8X'51m9UD",
    "VYDcIkrLDFA5cDz8mHGp;6x6RqU"]

The Tester

It turns out that, once again, as with the binary conversion case from edabit, there is a python built-in that solves this problem (isalnum). From the python documentation:

Return True if all characters in the string are alphanumeric and there is at least one character, False otherwise. A character c is alphanumeric if one of the following returns True: c.isalpha(), c.isdecimal(), c.isdigit(), or c.isnumeric().

Curiously, there is an isdecimal method and an isdigit method. I read the documentation for them and it appears that isdigit actually encompasses more than the 10 digits of the base-10 system, including something called the Kharosthi Numbers so this function is too permissive, but the test cases they gave don't seem to have any exotic characters so I'm going to assume that it will work as a validator for this problem.

def tester(testee: Callable[[str], bool], cases: list=TEST_CASES,
           throw: bool=True) -> None:
    """Run the testee over the test-cases

    Args:

     - `testee`: function to check if a string is alphanumeric
     - `cases`: iterable test-cases to check
     - `throw`: Throw a exception if a case fails (otherwise just print failure)

    Raises:

     AssertionError if any case fails
    """
    for test_case in cases:
        try:
            expect(test_case.isalnum()).to(equal(testee(test_case)))
        except AssertionError as error:
            print("Failed Test Case: '{0}' Expected: {1} Actual: {2}".format(
                test_case, test_case.isalnum(), testee(test_case)))
            if throw:
                raise
    return

A Solution

The problem-page on CodeWars shows a "RegularExpressions" tag so I'm going to assume that that's the way to solve it. My first thought was to use the \w special character, but the documentation says:

Matches Unicode word characters; this includes alphanumeric characters (as defined by str.isalnum()) as well as the underscore (_). If the ASCII flag is used, only [a-zA-Z0-9_] is matched.

The description says that it's equivalent to [a-zA-Z0-9_], so we can't use it (we don't want underscores), but if we use the same character ranges and leave out the underscore, it should work.

ALPHANUMERIC = "[a-zA-Z0-9]"
ONE_OR_MORE = "+"

PATTERN_ONE = re.compile(ALPHANUMERIC + ONE_OR_MORE)

def submitted(password: str) -> bool:
    """Check if the input is alphanumeric

    Args:

     - password: string to check

    Returns:

      True if alphanumeric False otherwise
    """
    return PATTERN_ONE.fullmatch(password) is not None

tester(submitted)

As a check, I'll see what happens if I used the \w instead.

WITH_UNDERSCORES = re.compile("\w" + ONE_OR_MORE)

def allows_underscores(password: str) -> bool:
    """Checks if the password has only alphanumeric or underscore characters

    Args:

      - password: input to check

    Returns:

     - True if valid
    """
    return WITH_UNDERSCORES.fullmatch(password) is not None

JUST_UNDERSCORES = "____"
print(JUST_UNDERSCORES.isalnum())
print(allows_underscores(JUST_UNDERSCORES))

tester(allows_underscores)
False
True

So, that was a surprise. Why did allows_underscores pass the tests? If you look back at the test-cases you'll see that none of them have just letters, digits, and underscores, if there's an underscore then there's also white-space or some other invalid character. Seems like there's a hole in their testing.

Let's add in a couple of cases that should fail.

EXTRA = "".join(random.choices(ascii_letters + digits + "_", k=25)) + "_"
TEST_CASES_2 = TEST_CASES + [EXTRA, JUST_UNDERSCORES]

tester(allows_underscores, TEST_CASES_2, throw=False)

That's better.

The End

So, another not so exciting problem, but I did learn that there's a fullmatch function now, spurring me to look up what the match and search methods do again, which was useful. As a note to my future self, match and fullmatch are essentially shortcuts so you don't have to use the beginning and ending of line characters. That is to say, search("^[a-z]+") is the equivalent of match("[a-z]+") and search("^[a-z]+$") is the equivalent of fullmatch("[a-z]+")". There might be other differences, but for simple cases like this that'll do.

Links

Not-Very-Secure: The CodeWars problem page.

StackOverflow: Answer explaining the difference between match, search, and fullmatch.

Python's Regular Expressions: The documentation page.

str.isalnum: python's built-in types page with the string methods.

Edabit: Convert a Number to Base-2

The Problem Statement (from Edabit)

This is the first time I took a look at edabit, their problems seem simpler than LeetCode, although I haven't gone too deep into it. Here's there description of the problem.

Create a function that returns a base-2 (binary) representation of a base-10 (decimal) string number. To convert is simple: ((2) means base-2 and (10) means base-10) 010101001(2) = 1 + 8 + 32 + 128.

Going from right to left, the value of the most right bit is 1, now from that every bit to the left will be x2 the value, value of an 8 bit binary numbers are (256, 128, 64, 32, 16, 8, 4, 2, 1).

Examples:

binary(1)  "1"
binary(5)  "101"
binary(10)  "1010"

Notes

  • Numbers will always be below 1024 (not including 1024).
  • The strings will always go to the length at which the most left bit's value gets bigger than the number in decimal.
  • If a binary conversion for 0 is attempted, return "0".

A Tester

Now that I have the problem statement I can create a test function for it. First, I'll do some importing.

# python
import math
import random

from typing import Callable

# from pypi
from expects import equal, expect

The Built-In Solution

It turns out that python has a built-in function (bin) that solves the problem. They don't explicitly say not to use it, but I'll assume that it isn't what's intended to be used as a solution and use it instead to test any solution I create.

def built_in(integer: int) -> str:
    """Convert an integer to a string of bits

    Args:

     - `integer`: the base-10 number to convert

    Returns:

     bit-string equivalent of the input
    """
    return bin(integer).lstrip("-0b")

print(built_in(5))
101

The Tester

And now a test function that will compare the output of a function passed in to it to the built_in function's output.

def tester(testee: Callable[[int], str], value: int) -> None:
    """Tests an integer to binary function's output

    Args:

     - `testee`: function that converts an integer to a bit-string
     - `value`: integer to test

    Raises:
      AssertionError if bit-string is incorrect.
    """
    if value == 0:
        expect(testee(value)).to(equal("0"))
    else:
        expect(testee(value)).to(equal(built_in(value)))
    return

The Tester Tester

After running the tester a few times I decided to add a runner around it to add a random number to the tests cases.

def tester_tester(testee: Callable[[int], str], verbose: bool=False) -> None:
    """Runs the testee over the given examples and a random number

    Args:

     - `testee`: function to convert integer to binary string
     - `verbose`: output the test-cases if true

    Raises:

      AssertionError if a case is fails
    """
    random_case = random.randrange(2, 1024)
    for test_case in ((0, 1, 5, 10, random_case)):
        try:
            tester(solution_one, test_case)
        except ValueError as error:
            print("error test-case: {}".format(test_case))
            break
        if verbose:
            print("decimal: {} binary: {}".format(test_case,
                                                  solution_one(test_case)))
    return

I can't really remember the circumstances anymore, but I think I'm trapping the ValueError because that's raised of the input to the function can't be converted to a binary string. This was probably the 0 test-case.

The First Python Solution

There's probably an even more brute-force way to do this, but the first thing that came into my mind was to use the \(\log_2\) function to find out what exponent of 2 is needed to create the input (without going over) so I'd know how many bits I'd need (this actually gives one less than what you need, since we need an extra bit for the \(2^0\) digit). After finding the exponent the function will then traverses a range in backwards order \(\{n, n-1, \ldots, 1, 0\}\), using the values as the exponent for the next power of 2 to test. If the power is less than the amount remaining to convert I add a 1 to the bit-list and subtract the power from the remainder and if not I add a 0 to the list and move to the next exponent to check.

def solution_one(value: int) -> str:
    """Convert the integer to a binary string

    Args:
     - value: base-10 integer to convert

    Returns:
     - binary string version of the input
    """
    ZERO, ONE = "0", "1"
    if value == 0:
        return ZERO

    digits = math.floor(math.log2(value)) + 1
    remainder = value
    bits = []

    for exponent in reversed(range(digits)):
        if 2**exponent <= remainder:
            bits.append(ONE)
            remainder -= 2**exponent
        else:
            bits.append(ZERO)
    return "".join(bits)

tester_tester(solution_one, verbose=True)
decimal: 0 binary: 0
decimal: 1 binary: 1
decimal: 5 binary: 101
decimal: 10 binary: 1010
decimal: 242 binary: 11110010

A Python 3 Solution

While I was checking out python's Power and Logarithmic functions I noticed that there was a helpful box noting that python's integers have a built in method called bit_length that will tell you how many bits you need to represent the integer. This should give the the same result as what I did with floor and log2 but since they went to all the trouble of adding it as a method it seemed like it'd be a shame not to use it.

def solution_three(value: int) -> str:
    """Convert the integer to a binary string

    Args:
     - value: base-10 integer to convert

    Returns:
     - binary string version of the input
    """
    ZERO, ONE = "0", "1"
    digits = value.bit_length()
    bits = []
    remainder = value

    for exponent in reversed(range(digits)):
        if 2**exponent <= remainder:
            bits.append(ONE)
            remainder -= 2**exponent
        else:
            bits.append(ZERO)
    return "".join(bits) if bits else ZERO

tester_tester(solution_three, verbose=True)
decimal: 0 binary: 0
decimal: 1 binary: 1
decimal: 5 binary: 101
decimal: 10 binary: 1010
decimal: 926 binary: 1110011110

The End

So, there you go. I wasn't expecting to learn anything from this problem, but I didn't know the bit_length method exists so I guess it just goes to show that there's always something more to learn. Not that I can think of a use for it…

Links

Edabit

Convert a number to Base-2 : The Edabit problem page.

Python

math.log2: Python's documentation for the log2 function (where it says it's better than using the regular log function and passing in a base of 2)