Given an input c
, outputs the number of distinct lists of strings lst
such that:
''.join(lst) == c
s
in lst
, s
consists of an arbitrary character followed by one or more characters from ‘0123456789’Sure hope I didn’t mess this up, because I think the fundamental idea is quite elegant! Should run successfully for all “reasonable” inputs (as in, the numeric output fits in a uint64 and the input isn’t ridiculously long). Fundamental algorithm is O(n) if you assume all arithmetic is O(1). (If not, then I don’t know what the time complexity is, and I don’t feel like working it out.)
from functools import cache
from itertools import pairwise
from math import prod
@cache
def fibonacci(n: int) -> int:
if n == 0:
return 0
if n == 1:
return 1
return fibonacci(n - 1) + fibonacci(n - 2)
def main(compressed: str) -> int:
is_fragment_start = [i == 0 or c not in '0123456789' for i, c in enumerate(compressed)]
fragment_start_positions = [i for i, s in enumerate(is_fragment_start) if s]
fragment_lengths = [stop - start for start, stop in pairwise(fragment_start_positions + [len(compressed)])]
return prod(fibonacci(fragment_length - 1) for fragment_length in fragment_lengths)
if __name__ == '__main__':
from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument('compressed')
print(main(parser.parse_args().compressed))
Idea: a00010 -> [a000, 10] -> [length 4, length 2] -> F(4) * F(2)
01a102b0305 -> [01, a102, b0305] -> [length 2, length 4, length 5] -> F(2) * F(4) * F(5)
where F(n) = fibonacci(n - 1) is the number of ways to partition a string of length n into a list of strings of length ≥2.
F(2) = 1 = fibonacci(1), F(3) = 1 = fibonacci(2), and F(n) = F(n - 2) + F(n - 1), so F is indeed just an offset version of the Fibonacci sequence.
To see why F(n) = F(n - 2) + F(n - 1), here are the ways to split up ‘abcde’: [‘ab’] + (split up ‘cde’), [‘abc’] + (split up ‘de’), and [‘abcde’], corresponding to F(5) = F(3) + F(2) + 1.
And the ways to split up ‘abcdef’: [‘ab’] + (split up ‘cdef’), [‘abc’] + (split up ‘def’), [‘abcd’] + (split up ‘ef’), and [‘abcdef’], corresponding to F(6) = F(4) + F(3) + F(2) + 1 = F(4) + F(5) = F(6 - 2) + F(6 - 1).
The same logic generalizes to all n >= 4.
So every list of strings, where each string is some character followed by one or more digits, is a distinct, valid decompressing option. Thanks for clarifying!
Thanks for the update on checking through solutions, and thanks in general for all the work you’ve put into this community!
Would just like to clarify: what are the valid decompressed strings? For an input of a333a3
, should we return 2 (either a333 a3
or a3 33 a3
) or 1 (since a333 a3
isn’t a possible compression – it would be a336
instead)? Do we have to handle cases like a00010
, and if so, how?
My solution (runs in O(n) time, but so do all the other solutions so far as far as I can tell):
from itertools import pairwise
def main(s: str) -> str:
characters = [None] + list(s) + [None]
transitions = []
for (_, left), (right_idx, right) in pairwise(enumerate(characters)):
if left != right:
transitions.append((right_idx, right))
repeats = [(stop - start, char) for (start, char), (stop, _) in pairwise(transitions)]
return ''.join(f'{char}{length}' for length, char in repeats)
if __name__ == '__main__':
from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument('s')
print(main(parser.parse_args().s))
Runthrough:
'aaabb'
-> [None, 'a', 'a', 'a', 'b', 'b', None]
-> [(1, 'a'), (4, 'b'), (6, None)]
-> [(4 - 1, 'a'), (6 - 4, 'b')]
Golfed (just for fun, not a submission):
import sys
from itertools import pairwise as p
print(''.join(c+str(b-a)for(a,c),(b,_)in p([(i,r)for i,(l,r)in enumerate(p([None,*sys.argv[1],None]))if l!=r])))
I actually found this challenge to be easier than this week’s medium challenge. (Watch me say that and get this wrong while also getting the medium one correct…) Here’s an O(n) solution:
bracket_pairs = {('(', ')'), ('[', ']'), ('{', '}')}
def main(brackets: str) -> str:
n = len(brackets)
has_match_at = {i: False for i in range(-1, n + 1)}
acc = []
for i, bracket in enumerate(brackets):
acc.append((i, bracket))
if len(acc) >= 2:
opening_idx, opening = acc[-2]
closing_idx, closing = acc[-1]
if (opening, closing) in bracket_pairs:
acc.pop(), acc.pop()
has_match_at[opening_idx] = has_match_at[closing_idx] = True
longest_start, longest_end = 0, 0
most_recent_start = None
for left_idx, right_idx in zip(range(-1, n), range(0, n + 1)):
has_match_left = has_match_at[left_idx]
has_match_right = has_match_at[right_idx]
if (has_match_left, has_match_right) == (False, True):
most_recent_start = right_idx
if (has_match_left, has_match_right) == (True, False):
most_recent_end = right_idx
if most_recent_end - most_recent_start > longest_end - longest_start:
longest_start, longest_end = most_recent_start, most_recent_end
return brackets[longest_start:longest_end]
if __name__ == '__main__':
from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument('brackets')
print(main(parser.parse_args().brackets))
We start off by doing the same thing as this week’s easy challenge, except we keep track of the indices of all of the matched brackets that we remove (opening or closing). We then identify the longest stretch of consecutive removed-bracket indices, and use that information to slice into the input to get the output.
For ease of implementation of the second part, I modelled the removed-bracket indices with a dict simulating a list indexed by [-1 … n + 1), with the values indicating whether the index corresponds to a matched bracket. The extra elements on both ends are always set to False. For example, {([])()[(])}()]
-> FFTTTTTTFFFFFTTFF
, and ([{}])
-> FTTTTTTF
. To identify stretches of consecutive indices, we can simply watch for when the value switches from False to True (start of a stretch), and from True to False (end of a stretch). We do that by pairwise-looping through the dict-list, looking for ‘FT’ and ‘TF’.
Interesting approach, but from my understanding of the code, it doesn’t generate matches like [()][()]
. I could be wrong, but I don’t see how you can get that by prepending, appending, and enclosing just ()
, []
, and/or {}
.
I’m also assuming that [()][()]
is supposed to be one of the results for n = 8. At least two others here seem to have made that assumption, and I believe it’s consistent with the previous challenge. Would be nice to have some clarification on this, though.
Here’s my Python solution. If my reasoning is all correct, it should (in theory) run in O(n * (number of matches)), which should be optimal since it takes at least that long to print out the results anyway. IF my reasoning is all correct, and my program is all correct.
closing_to_opening = {')': '(', ']': '[', '}': '{'}
def main(n: int) -> list[str]:
assert n % 2 == 0, f'n must be even; got {n=}'
acc: list = [(n // 2, None, None)]
for i in range(n):
new_acc = []
for closing_brackets_left, unmatched_series, full_series in acc:
if unmatched_series is not None:
most_recent_closing_bracket, rest = unmatched_series
matching_opening_bracket = closing_to_opening[most_recent_closing_bracket]
new_acc.append((closing_brackets_left, rest, (matching_opening_bracket, full_series)))
if closing_brackets_left > 0:
for bracket in [')', ']', '}']:
new_acc.append((closing_brackets_left - 1, (bracket, unmatched_series), (bracket, full_series)))
acc = new_acc
result = []
for _, _, series in acc:
series_as_list = []
for _ in range(n):
bracket, series = series
series_as_list.append(bracket)
result.append(''.join(series_as_list))
return result
if __name__ == '__main__':
from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument('n')
result = main(int(parser.parse_args().n))
print('\n'.join(result))
Idea: The matches of size n are precisely the strings with n/2 closing brackets, n/2 opening brackets, and brackets arranged so that each closing bracket matches up with the opening bracket on the “top of the stack” when processing the string and removing matches. We build the strings backwards. For each match-in-construction, we track the number of closing brackets left to be added, the “stack” (but working backwards, so the roles of opening and closing brackets are reversed), and, of course, the actual string. We transform each match-in-construction into 1, 3, or 4 new matches-in-construction, adding one character at a time: the opening bracket that matches the closing bracket on the top of the stack (if any), and the three closing brackets (if we still have closing brackets to add). Because appending to strings is O(n) since we need to copy, and pushing and popping from Python lists creates mutable aliasing issues (and would take O(n) to copy, just like with strings), we do a Lisp and use cons cells to create lists instead (hence, the backwards building). I suspect it gives the same asymptotic runtime anyway, but I don’t actually know whether that’s true.
Here’s an O(n) solution using a stack instead of repeated search & replace:
closing_to_opening = {')': '(', ']': '[', '}': '{'}
brackets = input()
acc = []
for bracket in brackets:
if bracket in closing_to_opening:
if acc and acc[-1] == closing_to_opening[bracket]:
acc.pop()
else:
acc.append(bracket)
else:
acc.append(bracket)
print(''.join(acc))
Haven’t thoroughly thought the problem through (so I’m not 100% confident in the correctness of the solution), but the general intuition here is that pairs of brackets can only match up if they only have other matching pairs of brackets between them. You can deal with matching pairs of brackets on the fly simply by removing them, so there’s actually no need for backtracking.
Golfed, just for fun:
a=[]
[a.pop()if a and a[-1]==dict(zip(')]}','([{')).get(b)else a.append(b)for b in input()]
print(''.join(a))
(Attempt 2 at posting something like this, kbin was being weird last time)
Link to a YouTube video going through the main part of the newsletter, with links to other stuff in the description: https://youtu.be/wadIR3wjDfQ
My implementation is memoized by
functools.cache
, but that is a concern when it comes to recursive Fibonacci. That, and stack overflows, which are also a problem for my code (but, again, not for “reasonable” inputs – fibonacci(94) already exceeds 2^64).Time complexity-wise, I was more thinking about the case where the numbers get so big that addition, multiplication, etc. can no longer be modelled as taking constant time. Especially if
math.prod
andenumerate
are implemented in ways that are less efficient for huge integers (I haven’t thoroughly checked, and I’m not planning to).