
Validating Standard Numbers

Background

Validating identification data is always tricky. The sheer number of national (and international) identification numbers can make it overwhelming for developers to come up with the right solution. In this article, I’m going to share some approaches to, and thoughts on, validating some of these crucial identification data: IBAN, BIC, and Social Security Numbers.

Customers rely on Fountain’s Data Collection stage to collect employment-critical - and often sensitive - applicant data. The Data Collection stage consists of a series of questions that are relevant to making a hiring decision. Social Security Number (ssn) and Driver License Number (driver_license_number) are examples of Standard Questions asked by a lot of Fountain customers. On top of these Standard Questions, Fountain customers can also add Custom Questions to collect customer-specific data: T-Shirt Size (t_shirt_size), for example.

To ensure the validity of applicant data, Fountain already offers predefined validation logic for some of these Standard Questions. For Custom Questions, customers can define a regex validation for each question.

Pain Points

  • IBAN (International Bank Account Number) and BIC (Business Identifier Code) were missing from the Standard Questions list.
    • IBAN and BIC are among the questions most frequently asked by Fountain’s European customers to collect applicants’ bank data.
    • An IBAN, which consists of up to 34 alphanumeric characters, is tedious to type and easy for applicants to enter incorrectly.
    • Some customers added their own regex validation, but regex isn’t sufficient to check the validity of IBAN and BIC numbers: it can check the format, but not the check digits built into an IBAN.
  • SSN (Social Security Number) validation
    • Fountain’s built-in validation covered U.S. SSNs only. Many countries have their own version of social security numbers, and validation for these numbers was missing.
      • Sozialversicherungsnummer (Austria & Germany)
      • Burgerservicenummer (Netherlands) … and the list goes on
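To make the regex limitation concrete, here is a minimal sketch of the mod-97 check that IBAN check digits are based on. The function name `isIbanChecksumValid` is hypothetical and not part of Fountain’s code or of any library mentioned in this article. It only checks the general shape and the check digits, not country-specific lengths or whether the account exists.

```typescript
// Sketch of the IBAN mod-97 check. `isIbanChecksumValid` is a hypothetical name.
function isIbanChecksumValid(input: string): boolean {
  const iban = input.replace(/\s+/g, '').toUpperCase();

  // Shape only: country code, two check digits, up to 30 alphanumeric characters.
  if (!/^[A-Z]{2}[0-9]{2}[A-Z0-9]{1,30}$/.test(iban)) {
    return false;
  }

  // Step 1: move the first four characters to the end.
  const rearranged = iban.slice(4) + iban.slice(0, 4);

  // Step 2: replace letters with numbers (A=10 ... Z=35) and compute the
  // remainder mod 97 digit by digit, so no big-integer type is needed.
  let remainder = 0;
  for (let i = 0; i < rearranged.length; i++) {
    const code = rearranged.charCodeAt(i);
    const digits = code >= 65 ? String(code - 55) : rearranged.charAt(i);
    for (let j = 0; j < digits.length; j++) {
      remainder = (remainder * 10 + Number(digits.charAt(j))) % 97;
    }
  }

  // Step 3: a valid IBAN leaves a remainder of exactly 1.
  return remainder === 1;
}
```

A single mistyped digit keeps the same format, so a regex would still accept it, but it changes the remainder and the check above rejects it.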

Solution

To validate standard numbers across different countries, we looked for existing open source projects and external API services to integrate with. Below are the service and the open source project we adopted.

  • iban.com: Provides an API for validating IBAN and BIC. On top of validation, the Validation API enabled us to extract the BIC from the IBAN, reducing the number of questions that applicants have to answer. Considering the number of applications that Fountain processes, €3800 / year for unlimited API calls seemed reasonable.
  • stdnum-js: An open source JS library for validating national numbers. The package already offered validation for numerous national numbers, but it was missing validation for some numbers we needed, like the German SVNR or the UK NINO. I’ll cover more details in the next section.

Working with stdnum-js (open source)

Building an in-house software solution is always reassuring - it just takes more resources 😂. At Fountain, we leverage a lot of external services and APIs to offer our customers a comprehensive hiring solution. Some examples are DocuSign and HelloSign for document signatures, WhatsApp and Twilio for messaging, and Checkr, Accurate, and Sterling for background checks.

So, why not for “simple” data validations? Below are a few things we considered beforehand.

  • The number of countries and standard numbers we want to support validation for.
  • Whether the 3rd party solution is reliable, scalable, and maintainable.
  • Fountain Engineering resources (# engineers, # hours).

stdnum-js seemed to meet our standards pretty well - 1) Good unit test coverage with reference to validation logic source, 2) Great maintainer w/ quick response (shout out to David Koblas 👋), 3) Frequent updates with validations for new standard numbers.

Here’s our first PR to the package for German SVNR / RVNR (German social security / pension number). We got our first review in 2 days, and the PR was merged in about a week. When working with an open source project, it’s always good to start with small updates, get early feedback, and check with the maintainers whether the changes are feasible for production use.

Code Example

A lot of standard numbers are validated with a checksum that uses a check digit (1 or 2 characters in the number). Let’s take German SVNR / RVNR as an example. This Wikipedia article summarizes the validation logic for the number.

Sample number: 15 070649 C103

  • Digit 1-2: Area number of the pension insurance institution (15)
  • Digit 3-4: Day of birth of the insured (07)
  • Digit 5-6: Month of birth of the insured (06)
  • Digit 7-8: Year of birth of the insured (49)
  • Digit 9: First letter of the insured person’s maiden name (C)
  • Digit 10-11: Serial number (10; 00-49 = male, 50-99 = female or diverse or gender entry left open)
  • Digit 12: Check digit (3)
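The layout above can be sketched as a small parser. The names `SvnrParts` and `parseSvnr` are hypothetical and chosen for illustration only; they are not part of stdnum-js. The sketch only splits the number into its fields and does not verify the check digit.

```typescript
// Field layout of a German SVNR / RVNR, following the list above.
interface SvnrParts {
  area: string;       // digits 1-2
  day: string;        // digits 3-4
  month: string;      // digits 5-6
  year: string;       // digits 7-8
  initial: string;    // digit 9 (a letter)
  serial: string;     // digits 10-11
  checkDigit: string; // digit 12
}

// Returns the fields, or null when the input does not have the expected shape.
function parseSvnr(input: string): SvnrParts | null {
  const compact = input.replace(/\s+/g, '').toUpperCase();
  const match = /^(\d{2})(\d{2})(\d{2})(\d{2})([A-Z])(\d{2})(\d)$/.exec(compact);
  if (match === null) {
    return null;
  }
  const [, area, day, month, year, initial, serial, checkDigit] = match;
  return { area, day, month, year, initial, serial, checkDigit };
}
```

For the sample number, `parseSvnr('15 070649 C103')` yields area 15, day 07, month 06, year 49, initial C, serial 10, and check digit 3.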

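For German SVNR, the letter is first replaced by its two-digit position in the alphabet (C becomes 03), which turns the first 11 characters into 12 digits. The checksum is then calculated by multiplying each of these digits by its weight (2, 1, 2, 5, 7, 1, 2, 1, 2, 1, 2 and 1, in order), taking the sum of the digits of each product, and adding the results together. For validation, we check whether the checksum modulo (%) 10 matches the check digit (= 3 in this example). For the sample number, the total is 43, and 43 % 10 = 3.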

/**
 * Compute the weighted sum of a string
 * @param {boolean} sumByDigit - ex) if checksum entry is 18, add 9 (sum of digits) to the sum (instead of 18).
 */
export function weightedSum(
  value: string,
  {
    alphabet = '0123456789',
    reverse = false,
    weights = [1],
    modulus = 0,
    sumByDigit = false,
  }: {
    alphabet?: string;
    reverse?: boolean;
    modulus: number;
    weights?: number[];
    sumByDigit?: boolean;
  },
): number {
  const wlen = weights.length;
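  // Assumed from context: map each character to its index in `alphabet`,
  // then multiply by the weights (cycling through them if needed).
  const numbers = value.split('').map(c => alphabet.indexOf(c));
  const weighted = (reverse ? numbers.reverse() : numbers).map(
    (n, idx) => n * weights[idx % wlen],
  );

  return weighted.reduce((acc, v) => {
    let vv = v;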
    while (vv < 0) {
      vv += modulus;
    }

    if (sumByDigit && vv > 9) {
      return (acc + sumAllDigits(vv)) % modulus;
    }

    return (acc + vv) % modulus;
  }, 0);
}

...

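// Split off the check digit; the first 11 characters still contain the letter.
const [frontWithAlpha, check] = strings.splitAt(value, 11);
// Replace the letter with its two-digit number so that `front` has 12 digits.
const front = frontWithAlpha.split('').map(c => checkAlphabetDict[c] ?? c).join('');
const sum = weightedSum(front, {
  weights: [2, 1, 2, 5, 7, 1, 2, 1, 2, 1, 2, 1],
  modulus: 10,
  sumByDigit: true,
});
// The number is valid when `sum` matches the check digit in `check`.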
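The fragments above depend on helpers from the library, so here is a standalone sketch of the same check that can be run on its own. The names `SVNR_WEIGHTS`, `digitSum`, and `isSvnrChecksumValid` are hypothetical, and the code is a simplified illustration of the rule described in this section, not the code that was merged into stdnum-js.

```typescript
// Weights from the description above, one per digit after the letter is expanded.
const SVNR_WEIGHTS = [2, 1, 2, 5, 7, 1, 2, 1, 2, 1, 2, 1];

// Sum of the decimal digits of a non-negative integer, e.g. 35 -> 8.
function digitSum(n: number): number {
  let sum = 0;
  for (let rest = n; rest > 0; rest = Math.floor(rest / 10)) {
    sum += rest % 10;
  }
  return sum;
}

// Hypothetical helper: checks the shape and the check digit of an SVNR.
function isSvnrChecksumValid(input: string): boolean {
  const compact = input.replace(/\s+/g, '').toUpperCase();
  if (!/^\d{8}[A-Z]\d{3}$/.test(compact)) {
    return false;
  }

  // Replace the letter with its two-digit alphabet position (A=01 ... Z=26).
  const position = compact.charCodeAt(8) - 64;
  const letterDigits = (position < 10 ? '0' : '') + String(position);
  const expanded = compact.slice(0, 8) + letterDigits + compact.slice(9, 11);

  // Multiply each digit by its weight and add up the digit sums of the products.
  let total = 0;
  for (let i = 0; i < expanded.length; i++) {
    total += digitSum(Number(expanded.charAt(i)) * SVNR_WEIGHTS[i]);
  }

  // The last character of the number is the check digit.
  return total % 10 === Number(compact.charAt(11));
}
```

For the sample number 15 070649 C103, the expanded digits are 150706490310, the digit sums of the weighted products add up to 43, and 43 % 10 = 3 matches the check digit.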

Checksum validation logic varies across different standard numbers. Here are the standard numbers and abbreviations supported by stdnum-js.
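As one illustration of how much the logic can differ, the Dutch Burgerservicenummer mentioned earlier uses a different rule, commonly called the 11-test: the nine digits are weighted 9, 8, 7, 6, 5, 4, 3, 2 and -1, and the weighted sum must be divisible by 11. The sketch below uses a hypothetical function name, `isBsnChecksumValid`, and covers only this arithmetic, not any other rules that may apply to real numbers.

```typescript
// Sketch of the Dutch BSN 11-test. `isBsnChecksumValid` is a hypothetical name.
function isBsnChecksumValid(input: string): boolean {
  // Drop spaces and dots that people sometimes use as separators.
  const digits = input.replace(/[\s.]/g, '');
  if (!/^\d{9}$/.test(digits)) {
    return false;
  }

  // The last digit has a weight of -1 instead of 1.
  const weights = [9, 8, 7, 6, 5, 4, 3, 2, -1];
  let total = 0;
  for (let i = 0; i < digits.length; i++) {
    total += Number(digits.charAt(i)) * weights[i];
  }

  // Valid when the weighted sum is a multiple of 11.
  return total % 11 === 0;
}
```

Unlike the SVNR check, there is no letter to convert, no digit sum of the products, and no separate comparison against a check digit: the last digit simply takes part in the sum with a negative weight.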

Wrap up

Validating standard numbers is tricky - it’s less a technical challenge than a matter of time and effort: finding resources to refer to, and maintaining and expanding existing solutions. For a lot of use cases, finding a good open source project and iterating fast can be a good starting point.

Acknowledgement

  • Shout out to Mike Ryan (ex-Fountaineer) for major contributions in improving the library, and building the feature.