Splitting a numeric code in Python
Last updated: 2025-04-15
Drew Leske
Sometimes people do funky things with data. We have a use case where two two-digit codes are mashed together into a four-digit code, and we need to split them. These codes are stored as integers because, well, why not.
Perhaps the more obvious solution is to treat the code as a string and split it into the first two digits and the last two digits. Of course, the integer needs to be converted into a string first, and then the parts are converted back into integers. For example:
def strsplit(code):
    codestr = str(code)
    c1 = int(codestr[:2])
    c2 = int(codestr[2:])
    return (c1, c2)
I thought all this conversion might be a bit laboured. Another way is to leave it as an integer and use integer arithmetic:
def intsplit(code):
    c1 = int(code / 100)
    c2 = code - (c1 * 100)
    return (c1, c2)
Pretty simple and has the same results for all codes from 1000 to 9999:
# range() excludes its end value, so stop at 10000 to include 9999
for code in range(1000, 10000):
    r1 = strsplit(code)
    r2 = intsplit(code)
    assert r1 == r2
How do they compare in terms of time though? I expected the integer version to be faster: converting an integer into a string, slicing the string, and converting the pieces back involve memory manipulation on top of integer comparisons and integer arithmetic. The integer version only involves arithmetic, and most of that is integer arithmetic, which is probably cheaper, though I have no idea about Python implementations. The int(code / 100) call does introduce floating-point arithmetic, which I don’t want given that both code and the literal divisor are integers, but oh well.
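To make the floating-point detour concrete, here is a small check (a sketch, not part of the original timing code; the variable names are just for illustration) showing that / yields a float even when both operands are integers, while // stays in integer arithmetic:

```python
code = 1534

quotient_true = code / 100    # true division: always a float in Python 3
quotient_floor = code // 100  # floor division: an int when both operands are ints

print(type(quotient_true).__name__, quotient_true)
print(type(quotient_floor).__name__, quotient_floor)

# int() then has to convert the float back into an integer
print(int(quotient_true) == quotient_floor)
```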
I use Python’s built-in timeit module to see how the performance of the two methods compares:
import random
import timeit

# build a 100-element list of random codes
rands = [random.randint(1000, 9999) for _ in range(100)]
print("Time to split 100 codes using intsplit:")
print(timeit.timeit('[intsplit(code) for code in rands]', globals=globals()))
print("Time to split 100 codes using strsplit:")
print(timeit.timeit('[strsplit(code) for code in rands]', globals=globals()))
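The results are pretty consistent across several runs. Each figure is the total time in seconds for timeit’s default of one million executions of the statement: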
Time to split 100 codes using intsplit:
7.765972374996636
Time to split 100 codes using strsplit:
22.87222112499876
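If waiting for a million executions per measurement is too slow, timeit.repeat() accepts explicit number and repeat arguments. The following is a minimal sketch rather than the setup used for the figures in this post; the values 1000 and 5 are arbitrary choices for illustration, and the globals dictionary is passed explicitly so the snippet stands alone:

```python
import random
import timeit

def intsplit(code):
    c1 = int(code / 100)
    c2 = code - (c1 * 100)
    return (c1, c2)

rands = [random.randint(1000, 9999) for _ in range(100)]

# number: executions per measurement; repeat: how many measurements to take
timings = timeit.repeat(
    '[intsplit(code) for code in rands]',
    globals={'intsplit': intsplit, 'rands': rands},
    number=1000,
    repeat=5,
)

# The minimum is usually the least noisy figure to compare between methods
print(f"best of {len(timings)}: {min(timings):.4f} s per 1000 executions")
```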
Along the way, it occurred to me that perhaps round() is a better function to use than int(), since I don’t want a whole class conversion. Except in retrospect this will involve a mathematical operation as well as a class conversion, and, um, round() isn’t trunc() and must actually be more expensive since… so, yeah, facepalm.
I also thought it was odd that Python doesn’t natively have an integer division operator. Turns out it does: //.
So here are a couple of extra test results:
Time to split 100 codes using intsplitround:
12.031848040991463
Time to split 100 codes using trueintsplit:
6.611741250031628
Time to split 100 codes using truncintsplit:
8.231211709033232
These all vary in the first operation, the division. Respectively, they use:
c1 = round(code / 100) (and yeah, this one is silly)
c1 = code // 100 (integer division!)
c1 = math.trunc(code / 100) (actual truncation)
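One thing the timings do not show is that the round() variant is not only slower but also returns a different split whenever the quotient rounds up, for example for 1551. A quick check, reusing two of the functions from the full test code:

```python
import math

def intsplitround(code):
    c1 = round(code / 100)
    c2 = code - (c1 * 100)
    return (c1, c2)

def truncintsplit(code):
    c1 = math.trunc(code / 100)
    c2 = code - (c1 * 100)
    return (c1, c2)

print(intsplitround(1534))  # (15, 34): 15.34 rounds down, so this looks fine
print(intsplitround(1551))  # (16, -49): 15.51 rounds up, so the split is wrong
print(truncintsplit(1551))  # (15, 51): truncation gives the intended split
```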
Lee asked why I didn’t use modulus. (Because I forgot.) That provides another improvement:
Time to split 100 codes using intmodsplit:
6.056473457952961
This is with c2 = code % 100 to isolate the second two-digit chunk.
The winner
So the winner is the “true” integer split, and the only cost is when you come across that operator for the first time–the cost of learning something new. (Presumably after wondering if it’s a typo, maybe after the first retinal hit suggests to you it’s a JavaScript comment.) Using the modulus operator rather than multiplication and subtraction gives another slight gain. It also decouples isolation of the second chunk from that of the first chunk, which is of no great consequence here but is always nice.
So here is the final function:
def intmodsplit(code):
    """Split a four-digit code into two two-digit codes."""
    c1 = code // 100
    c2 = code % 100
    return (c1, c2)
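As an aside, Python’s built-in divmod() returns the floor-division quotient and the remainder in one call, so the same split can be written in a single line. This is only a sketch: the name divmodsplit is made up for illustration, and it was not part of the timing runs above, so no claim is made about its speed:

```python
def divmodsplit(code):
    """Split a four-digit code into two two-digit codes using divmod()."""
    # For integers, divmod(a, b) is the pair (a // b, a % b)
    return divmod(code, 100)

print(divmodsplit(1534))  # (15, 34)
```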
Some subtleties worth mentioning
Lee pointed out that code // 100 is a floor operation, while int(code / 100) is a truncation. For our use case, where valid values of code are in the range 1000-9999 (four-digit positive integers), these are equivalent. But not for actual math, where negative numbers come into play:
>>> 1534 // 100
15
>>> -1534 // 100
-16
>>> int(-1534 / 100)
-15
>>> int(1534 / 100)
15
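The same difference carries over to the second chunk. The sketch below (variable names are illustrative only) splits a negative value both ways: each pair still recombines to the original number, but the parts themselves differ, which is why the equivalence only holds for the positive codes in our use case:

```python
code = -1534

# Floor division paired with modulus
floor_parts = (code // 100, code % 100)

# Truncation paired with multiplication and subtraction
c1 = int(code / 100)
trunc_parts = (c1, code - c1 * 100)

print(floor_parts)  # (-16, 66)
print(trunc_parts)  # (-15, -34)

# Both pairs recombine to the original value
assert floor_parts[0] * 100 + floor_parts[1] == code
assert trunc_parts[0] * 100 + trunc_parts[1] == code
```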
Full test code and output
import timeit
import random
import math

def strsplit(code):
    codestr = str(code)
    c1 = int(codestr[:2])
    c2 = int(codestr[2:])
    return (c1, c2)

def intsplit(code):
    c1 = int(code/100)
    c2 = code - (c1*100)
    return (c1, c2)

def intsplitround(code):
    c1 = round(code/100)
    c2 = code - (c1*100)
    return (c1, c2)

def trueintsplit(code):
    c1 = code // 100
    c2 = code - (c1*100)
    return (c1, c2)

def truncintsplit(code):
    c1 = math.trunc(code/100)
    c2 = code - (c1*100)
    return (c1, c2)

def intmodsplit(code):
    c1 = code // 100
    c2 = code % 100
    return (c1, c2)

# range(100) yields exactly 100 codes; range(1, 100) would yield only 99
rands = [random.randint(1000, 9999) for _ in range(100)]
print("Time to split 100 codes using intsplit:")
print(timeit.timeit('[intsplit(code) for code in rands]', globals=globals()))
print("Time to split 100 codes using strsplit:")
print(timeit.timeit('[strsplit(code) for code in rands]', globals=globals()))
print("Time to split 100 codes using intsplitround:")
print(timeit.timeit('[intsplitround(code) for code in rands]', globals=globals()))
print("Time to split 100 codes using trueintsplit:")
print(timeit.timeit('[trueintsplit(code) for code in rands]', globals=globals()))
print("Time to split 100 codes using truncintsplit:")
print(timeit.timeit('[truncintsplit(code) for code in rands]', globals=globals()))
print("Time to split 100 codes using intmodsplit:")
print(timeit.timeit('[intmodsplit(code) for code in rands]', globals=globals()))
Here’s the output of the latest run:
Time to split 100 codes using intsplit:
7.8108298749430105
Time to split 100 codes using strsplit:
22.905359833035618
Time to split 100 codes using intsplitround:
11.935492750024423
Time to split 100 codes using trueintsplit:
6.640683250036091
Time to split 100 codes using truncintsplit:
8.259806292015128
Time to split 100 codes using intmodsplit:
6.056473457952961