Splitting a numeric code in Python

Last updated: 2025-04-15

Drew Leske

Sometimes people do funky things with data. We have a use case where two two-digit codes are mashed together into a four-digit code, and we need to split them. These codes are stored as integers because, well, why not.

Perhaps the more obvious solution is to treat the code as a string and take the first two digits and then the last two. Of course, the integer needs to be converted into a string first, and then the parts converted back into integers. For example:

def strsplit(code):
    codestr = str(code)
    c1 = int(codestr[:2])
    c2 = int(codestr[2:])
    return (c1, c2)

I thought all this conversion might be a bit laboured. Another way is to leave it as an integer and use integer arithmetic:

def intsplit(code):
    c1 = int(code / 100)
    c2 = code - (c1 * 100)
    return (c1, c2)

Pretty simple and has the same results for all codes from 1000 to 9999:

for code in range(1000, 10000):  # range() excludes its end point
    r1 = strsplit(code)
    r2 = intsplit(code)
    assert r1 == r2

How do they compare in terms of time, though? I expected the integer version to be faster: converting an integer into a string, slicing the string, and converting the pieces back all involve memory manipulation on top of integer comparisons and integer arithmetic. The integer version only involves arithmetic, and most of that is integer arithmetic, which is probably cheaper, though I have no idea about Python implementations. The int(code / 100) introduces floating-point arithmetic, because / is true division in Python 3 and returns a float even when both code and the literal divisor are integers. I don’t want it to, but oh well.
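To make the floating-point detour concrete, here is a small sketch (the variable names are only for illustration) showing what int(code / 100) does along the way:

```python
code = 1534

# True division produces a float, even though both operands are ints.
quotient = code / 100
print(type(quotient).__name__, quotient)   # float 15.34

# int() then truncates that float back to an integer.
truncated = int(quotient)
print(type(truncated).__name__, truncated) # int 15
```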

I use the timeit module from Python’s standard library to see how the performance of the two methods compares:

# build a 100-element list of random codes
rands = [random.randint(1000, 9999) for _ in range(100)]

print("Time to split 100 codes using intsplit:")
print(timeit.timeit('[intsplit(code) for code in rands]', globals=globals()))

print("Time to split 100 codes using strsplit:")
print(timeit.timeit('[strsplit(code) for code in rands]', globals=globals()))
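If the absolute numbers matter, timeit.repeat() lets you state the run counts explicitly and take the best of several measurements. This is only a sketch: the number and repeat values are arbitrary choices of mine, and I pass an explicit dictionary for globals so the snippet does not depend on where it is run.

```python
import random
import timeit

def intsplit(code):
    c1 = int(code / 100)
    c2 = code - (c1 * 100)
    return (c1, c2)

rands = [random.randint(1000, 9999) for _ in range(100)]

number = 10_000   # statement executions per measurement (arbitrary)
repeat = 5        # independent measurements (arbitrary)

times = timeit.repeat(
    '[intsplit(code) for code in rands]',
    globals={'intsplit': intsplit, 'rands': rands},
    number=number,
    repeat=repeat,
)

# The minimum is usually the least noisy figure to compare.
best = min(times)
per_split_ns = best / (number * len(rands)) * 1e9
print(f"best of {repeat}: {best:.4f} s, roughly {per_split_ns:.0f} ns per split")
```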

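The results are pretty consistent across several runs. The times are in seconds, and since timeit.timeit() runs the statement a million times by default, each figure covers a million passes over the list: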

Time to split 100 codes using intsplit:
7.765972374996636
Time to split 100 codes using strsplit:
22.87222112499876

Along the way, it occurred to me that perhaps round() is a better function to use than int(): I don’t want a whole class conversion. Except in retrospect this will involve a mathematical operation as well as a class conversion, and, um, round() isn’t trunc() and must actually be more expensive since… so, yeah, facepalm.
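The difference between rounding and truncating is not just a matter of cost. A quick sketch with an example code of my choosing shows that rounding gives a different split whenever the quotient rounds up:

```python
import math

code = 1587

print(int(code / 100))         # 15
print(math.trunc(code / 100))  # 15
print(round(code / 100))       # 16

# With the rounded quotient, the second chunk goes negative.
c1 = round(code / 100)
c2 = code - (c1 * 100)
print((c1, c2))                # (16, -13)
```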

I also thought it was odd that Python doesn’t natively have an integer division operator. Turns out it does: //.

So here are a couple of extra test results:

Time to split 100 codes using intsplitround:
12.031848040991463
Time to split 100 codes using trueintsplit:
6.611741250031628
Time to split 100 codes using truncintsplit:
8.231211709033232

These differ only in the first operation, the division. Respectively, they use:

  • c1 = round(code / 100) (and yeah, this one is silly, and it also gives the wrong answer whenever the last two digits exceed 50)
  • c1 = code // 100 (integer division!)
  • c1 = math.trunc(code / 100) (actual truncation)

Lee asked why I didn’t use modulus. (Because I forgot.) That provides another improvement:

Time to split 100 codes using intmodsplit:
6.056473457952961

This is with c2 = code % 100 to isolate the second two-digit chunk.
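As a quick illustration (variable names are mine), the two ways of isolating the second chunk agree, but the modulus version does not need c1 at all:

```python
code = 1534

# Multiplication and subtraction: depends on c1 being computed first.
c1 = code // 100
c2_sub = code - (c1 * 100)

# Modulus: computed directly from code.
c2_mod = code % 100

print(c2_sub, c2_mod)  # 34 34
```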

The winner

So the winner is the “true” integer split, and the only cost is when you come across that operator for the first time–the cost of learning something new. (Presumably after wondering if it’s a typo, maybe after the first retinal hit suggests to you it’s a JavaScript comment.) Using the modulus operator rather than multiplication and subtraction gives another slight gain. It also decouples isolation of the second chunk from that of the first chunk, which is of no great consequence here but is always nice.

So here is the final function:

def intmodsplit(code):
    """Split a four-digit code into two two-digit codes."""
    c1 = code // 100
    c2 = code % 100
    return (c1, c2)
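For what it’s worth, Python also has a built-in divmod() that returns the floor quotient and the remainder as a tuple in one call. The function name below is hypothetical and I have not timed it, so this is just a sketch of an equivalent spelling, not a claim that it is any faster:

```python
def divmodsplit(code):
    """Split a four-digit code into two two-digit codes using divmod()."""
    # divmod(a, b) returns (a // b, a % b)
    return divmod(code, 100)

print(divmodsplit(1534))  # (15, 34)
```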

Some subtleties worth mentioning

Lee pointed out that code // 100 is a floor operation, while int(code / 100) is a truncation. For our use case, where valid values of code are in the range 1000-9999 (four-digit positive integers), these are equivalent. But not for actual math:

>>> 1534 // 100
15
>>> -1534 // 100
-16
>>> int(-1534 / 100)
-15
>>> int(1534 / 100)
15
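The modulus operator follows the same floor convention, so a negative code would also produce a surprising second chunk. If out-of-range input is a possibility, one option is to reject it up front. The checkedsplit name and the choice of ValueError are assumptions of mine for this sketch, not part of the original use case:

```python
# % pairs with //, so the remainder takes the sign of the divisor.
print(-1534 // 100)  # -16
print(-1534 % 100)   # 66

def checkedsplit(code):
    """Split a four-digit code, refusing anything outside 1000-9999."""
    if not 1000 <= code <= 9999:
        raise ValueError(f"expected a four-digit code, got {code!r}")
    return (code // 100, code % 100)

print(checkedsplit(1534))  # (15, 34)
```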

Full test code and output

import timeit
import random
import math

def strsplit(code):
    codestr = str(code)
    c1 = int(codestr[:2])
    c2 = int(codestr[2:])
    return (c1, c2)

def intsplit(code):
    c1 = int(code/100)
    c2 = code - (c1*100)
    return (c1, c2)

def intsplitround(code):
    c1 = round(code/100)
    c2 = code - (c1*100)
    return (c1, c2)

def trueintsplit(code):
    c1 = code // 100
    c2 = code - (c1*100)
    return (c1, c2)

def truncintsplit(code):
    c1 = math.trunc(code/100)
    c2 = code - (c1*100)
    return (c1, c2)

def intmodsplit(code):
    c1 = code // 100
    c2 = code % 100
    return (c1, c2)

# build a 100-element list of random codes
rands = [random.randint(1000, 9999) for _ in range(100)]

print("Time to split 100 codes using intsplit:")
print(timeit.timeit('[intsplit(code) for code in rands]', globals=globals()))

print("Time to split 100 codes using strsplit:")
print(timeit.timeit('[strsplit(code) for code in rands]', globals=globals()))

print("Time to split 100 codes using intsplitround:")
print(timeit.timeit('[intsplitround(code) for code in rands]', globals=globals()))

print("Time to split 100 codes using trueintsplit:")
print(timeit.timeit('[trueintsplit(code) for code in rands]', globals=globals()))

print("Time to split 100 codes using truncintsplit:")
print(timeit.timeit('[truncintsplit(code) for code in rands]', globals=globals()))

print("Time to split 100 codes using intmodsplit:")
print(timeit.timeit('[intmodsplit(code) for code in rands]', globals=globals()))

Here’s the output of the latest run:

Time to split 100 codes using intsplit:
7.8108298749430105
Time to split 100 codes using strsplit:
22.905359833035618
Time to split 100 codes using intsplitround:
11.935492750024423
Time to split 100 codes using trueintsplit:
6.640683250036091
Time to split 100 codes using truncintsplit:
8.259806292015128
Time to split 100 codes using intmodsplit:
6.056473457952961