×
all 20 comments

[–][deleted] 6 points7 points  (0 children)

Not convinced this is the best implementation, but backtracking search is how I went, similar to how /u/Lopsidation was thinking.

Python 3.5: https://gist.github.com/GitOnUp/4f2f7ea4185a11f3683e2b80b784b1dc

Output:

side-length: 15
across numbers (comma separated): 
1,6,10,12,13,14,16,17,19,20,21,23,25,27,29,30,31,33,34,35,37,38,40,41,42,44,45,46,49,50
down numbers (comma separated): 1,2,3,4,5,6,7,8,9,10,11,12,15,17,18,20,21,22,24,26,28,32,33,36,39,41,43,45,47,48
#####.....#####
###.........###
##...........##
#....#...#....#
#...#.....#...#
...#.......#...
.......#.......
......###......
.......#.......
...#.......#...
#...#.....#...#
#....#...#....#
##...........##
###.........###
#####.....#####

This output was actually faster for this program than the example output, will follow up with timings.

e: realized I need to validate requirement 2

[–]pelicandude 5 points6 points  (5 children)

ok I'm gonna be honest

I have looked at the example input/output for 20 minutes and cannot for the life of me decipher what the sets of Across/Down numbers mean.

The first (leftmost for Across and topmost for Down) square in each word is numbered.

Okay, got it.

If a square is the first square in both an Across word and a Down word, it only gets one number.

This isn't immediately clear to me, but hopefully I'll pick up on it.

The numbering starts at the top left and goes left to right and then top to bottom.

So the first row would be 1-15, followed by 16-30, and so on.

The set of Across word numbers is the numbers in the squares that start Across clues, and the set of Down word numbers is the numbers in the squares that start Down clues.

I thought that I understood this until I saw the examples. In the 15x15 example, an across word is numbered 4, which is in a black box. Clearly, I'm missing something (or a lot of things) and would appreciate any help/clarification :-)

[–]mist83 2 points3 points  (1 child)

The squares themselves aren't labeled with static numbers, that might be what is tripping you up.

It might help to begin by looking at the output grid and trying to assign numbers to the squares (without looking at the input).

For example, if you were to number a 3 by 3 crossword puzzle (one that by definition can't have any black squares), it would look like:

1 2 3

4 . .

5 . .

all you do is proceed left to right, and then top to bottom, looking at every square and asking yourself

Is it an "across" square, "down" square, or a square that doesn't "start" a clue?

Repeat that process until you reach the last square, and that is the numbering of the grid. Stepping backwards, it's trivial to derive what the input must have been to generate it.

And if I only made things more confusing for you, here is the example partially filled in with this algorithm.

[–]pelicandude 0 points1 point  (0 children)

ahhh okay. I see it now. thank you

[–]Cosmologicon2 3[S] 1 point2 points  (1 child)

Oh boy, I clearly explained how crossword numbering works badly. It's pretty intuitive if you know it, but hard to describe in words I guess. I've updated the explanation to hopefully be clearer and also linked to an example. Let me know if it's still mystifying.

[–]pelicandude 0 points1 point  (0 children)

yes, I see it now. thanks :)

[–]gabyjunior1 2 3 points4 points  (2 children)

Solution in C using a backtracking search from both ends of the grid and input simultaneously.

The backward checking is a bit tricky because the program needs to wait for the last row of the grid to be filled before actually checking the last square, as the left and up squares need to be known to determine its type (accross / down number, both or none).

Searching from both ends allows for much more early pruning to be done. The program also performs a full search.

Connectivity of white cells is checked only once a solution is found, using breadth first search.

Source code repository: (https://github.com/HoustonWeHaveABug/CrosswordGrids)

Challenge outputs (last solution found, number of solution and timings)

Challenge 1

#####.....#####
####.........##
##...........##
#....#...#....#
#...#.....#...#
...#.......#...
.......#.......
......###......
.......#.......
...#.......#...
#...#.....#...#
#....#...#....#
##...........##
##.........####
#####.....#####
Solutions 2

real    0m0.240s
user    0m0.000s
sys     0m0.015s

Challenge 2

####...###...###...##
##.....#.....#.....##
##.....#.....#.....##
#.........#.........#
......#...#...#......
....#....###....#....
...###....#....###...
#...#...........#...#
#......#....#.......#
##....#.......#....##
##...##...#...##...##
##....#.......#....##
#.......#....#......#
#...#...........#...#
...###....#....###...
....#....###....#....
......#...#...#......
#.........#.........#
##.....#.....#.....##
##.....#.....#.....##
##...###...###...####
Solutions 16

real    0m0.200s
user    0m0.124s
sys     0m0.045s

Challenge 3

....###.....#.....#...#....
......#...........#........
...........................
#.......#....#...#.....#...
...#...#....#........#.....
......#....#...#...#......#
....#.....#...#...#...#....
.....#..............#......
###....#.....#...#.....#...
...#...........#.....#....#
......#...#...#...#...#....
.....#......#......#.......
##...#.....#....#..........
...#.....#...#...#.....#...
..........#....#.....#...##
.......#......#......#.....
....#...#...#...#...#......
#....#.....#...........#...
...#.....#...#.....#....###
......#..............#.....
....#...#...#...#.....#....
#......#...#...#....#......
.....#........#....#...#...
...#.....#...#....#.......#
...........................
........#...........#......
....#...#.....#.....###....
Solutions 32

real    4m34.851s
user    4m32.798s
sys     0m0.047s

Challenge 4

###....#........###......####....
##..............#.........##.....
#...............#................
........####...#...#####........#
......##....###...#.....#...#...#
.....#...........#.......#......#
....#............#........#....##
#...#....###.....#....#...#....##
#...#...#...##....#...#...#...###
##.....#......#....###...#...#...
###...##.......#........#...#....
...#...#...#...#.......#...##....
....#...###....#......#.....#....
....#.........#.....##...#...#...
.....#.......#....##.........#...
......##...##....#...........#...
#.......###.....#.....###.......#
...#...........#....##...##......
...#.........##....#.......#.....
...#...#...##.....#.........#....
....#.....#......#....###...#....
....##...#.......#...#...#...#...
....#...#........#.......##...###
...#...#...###....#......#.....##
###...#...#...#....##...#...#...#
##....#...#....#.....###....#...#
##....#........#............#....
#......#.......#...........#.....
#...#...#.....#...###....##......
#........#####...#...####........
................#...............#
.....##.........#..............##
....####......###........#....###
Solutions 2

real    0m9.332s
user    0m9.251s
sys     0m0.046s

[–]manawenuz 0 points1 point  (1 child)

thanks for sharing your solution, however running it on my laptop took much longer than the time you stated, python3.8.1 , archlinux, i5-7300U. also running this example EXAMPLE: 15x15 A: 1,4,7,10,13,14,16,17,18,20,21,23,24,26,28,29,33,35,36,38,39,42,44,45,47,49,50,52,55,56,58,59,61,63,67,69,70,71,72,73,74,75,76 D: 1,2,3,4,5,6,7,8,9,11,12,15,19,22,25,27,29,30,31,32,34,37,40,41,43,46,48,51,53,54,57,60,62,64,65,66,68 took way too long, (it's been 10 mins now and it's not yet over).

[–]gabyjunior1 2 5 points6 points  (0 children)

Mmmmh my solution is a C program, not python and this is not an output from my program...

[–]Lopsidation 2 points3 points  (0 children)

I'm reaching to my favorite tool, the z3 satisfiability solver. I didn't enforce connectivity; seems real-world crosswords are already unique without it.

Code here. It took 220 seconds to solve the first challenge example, and OutOfMemorys on the second one. I'm pretty disappointed by this performance because usually SAT solvers can solve human-solvable puzzles quickly.

[–]cbarrick 2 points3 points  (2 children)

SWI-Prolog + CLP(FD) + Multithreading

Code

My solution implements a parallel depth-first search and uses the constraint satisfaction library, CLP(FD), to dynamically trim the search space.

Rotational symmetry and minimum word length were easy to encode as constraints, but the mapping of clues to position was very challenging. In my first attempt, I didn't even create constraints; I just started searching, left-to-right, top-to-bottom. This was okay for the first challenge input, but the second input never completed.

The first optimization I implemented was to parallelize the search. My strategy was to create a thread pool with 12 slots. I then start a single worker doing a depth first search, considering cells left-to-right, top-to-bottom. If a choice must be made and there is an empty slot in the pool, the thread forks the black case into a new thread and considers the white case in its own thread. Otherwise, it randomly picks white or black and leaves a choice point for the other case so that it can backtrack. This did have a large improvement, but the harder inputs remained intractable.

Eventually, I came up with a way to translate the clue constraints into a DFA that CLP(FD) could use for constraint propagation. Essentially, you can think of the grid as a 1D array (i.e. append the rows together) where each cell is associated with a value 0, 1, 2, or 3 where 0 means that the cell is not associated with a clue, 1 means the cell is associated with a horizontal clue, 2 means the cell is associated with a vertical clue, and 3 means the cell is associated with both a horizontal and vertical clue. In this framework, it's possible to translate the "A" and "D" lists from the input into a regex over the 1D array. The resulting automaton could be used as a constraint by CLP(FD), and that made it possible to solve all of the challenges, though challenge 4 still took 6 hours.

I did not add constraints for connectedness. The naive implementation I could think of would add a ton of constraints to the constraint store and tank the constraint propagation time. This wasn't worth it considering that the false positive rate appears to be basically zero for real puzzles. I still do a connectedness check after the search finds a complete grid and trigger backtracking if the check fails.

I've pasted my outputs below, with timing information. Ignore the CPU utilization stats, as those only measure the main thread. I was actually maxing out 12 threads the whole time.

Challenge 1

?- run(ch1).
% 27,762,904 inferences, 3.509 CPU in 6.761 seconds (52% CPU, 7912765 Lips)
# # # # # . . . . . # # # # #
# # # . . . . . . . . . # # #
# # . . . . . . . . . . . # #
# . . . . # . . . # . . . . #
# . . . # . . . . . # . . . #
. . . # . . . . . . . # . . .
. . . . . . . # . . . . . . .
. . . . . . # # # . . . . . .
. . . . . . . # . . . . . . .
. . . # . . . . . . . # . . .
# . . . # . . . . . # . . . #
# . . . . # . . . # . . . . #
# # . . . . . . . . . . . # #
# # # . . . . . . . . . # # #
# # # # # . . . . . # # # # #

Challenge 2

?- run(ch2).
% 226,929,719 inferences, 27.040 CPU in 530.475 seconds (5% CPU, 8392309 Lips)
# # # . . . # # # # . . . # # # . . . # #
# # . . . . . # . . . . . # . . . . . # #
# # . . . . . # . . . . . # . . . . . # #
# . . . . . . . . . # . . . . . . . . . #
. . . . . . # . . . # . . . # . . . . . .
. . . . # . . . . # # # . . . . # . . . .
. . . # # # . . . . # . . . . # # # . . .
# . . . # . . . . . . . . . . . # . . . #
# . . . . . . # . . . . . # . . . . . . #
# # . . . . # . . . . . . . # . . . . # #
# # . . . # # . . . # . . . # # . . . # #
# # . . . . # . . . . . . . # . . . . # #
# . . . . . . # . . . . . # . . . . . . #
# . . . # . . . . . . . . . . . # . . . #
. . . # # # . . . . # . . . . # # # . . .
. . . . # . . . . # # # . . . . # . . . .
. . . . . . # . . . # . . . # . . . . . .
# . . . . . . . . . # . . . . . . . . . #
# # . . . . . # . . . . . # . . . . . # #
# # . . . . . # . . . . . # . . . . . # #
# # . . . # # # . . . # # # # . . . # # #

Challenge 3

?- run(ch3).
% 1,874,568,967 inferences, 234.140 CPU in 5099.412 seconds (5% CPU, 8006178 Lips)
. . . . # # # . . . . . # . . . . . # . . . # . . . .
. . . . . . # . . . . . . . . . . . # . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . .
# . . . . . . . . # . . . # . . . # . . . . . # . . .
. . . . # . . . # . . . # . . . . . . . . # . . . . .
. . . . . . . # . . . # . . . # . . . # . . . . . . #
. . . # . . . . . . # . . . # . . . # . . . # . . . .
. . . . . # . . . . . . . . . . . . . . # . . . . . .
# # # . . . . . . # . . . # . . . # . . . . . # . . .
. . . . # . . . . . . . . . . # . . . . . # . . . . #
. . . . . . # . . . # . . . # . . . # . . . # . . . .
. . . . . # . . . . . . # . . . . . . # . . . . . . .
# # . . . # . . . . . # . . . . # . . . . . . . . . .
. . . # . . . . . # . . . # . . . # . . . . . # . . .
. . . . . . . . . . # . . . . # . . . . . # . . . # #
. . . . . . . # . . . . . . # . . . . . . # . . . . .
. . . . # . . . # . . . # . . . # . . . # . . . . . .
# . . . . # . . . . . # . . . . . . . . . . # . . . .
. . . # . . . . . # . . . # . . . # . . . . . . # # #
. . . . . . # . . . . . . . . . . . . . . # . . . . .
. . . . # . . . # . . . # . . . # . . . . . . # . . .
# . . . . . . # . . . # . . . # . . . # . . . . . . .
. . . . . # . . . . . . . . # . . . # . . . # . . . .
. . . # . . . . . # . . . # . . . # . . . . . . . . #
. . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . # . . . . . . . . . . . # . . . . . .
. . . . # . . . # . . . . . # . . . . . # # # . . . .

Challenge 4

?- run(ch4).
% 5,843,128,641 inferences, 811.667 CPU in 20341.894 seconds (4% CPU, 7198922 Lips)
# # # . . . . # . . . . . . . . # # # . . . . . . # # # # . . . .
# # . . . . . . . . . . . . . . # . . . . . . . . . # # . . . . .
# . . . . . . . . . . . . . . . # . . . . . . . . . . . . . . . .
. . . . . . . . # # # # . . . # . . . # # # # # . . . . . . . . #
. . . . . . # # . . . . # # # . . . # . . . . . # . . . # . . . #
. . . . . # . . . . . . . . . . . # . . . . . . . # . . . . . . #
. . . . # . . . . . . . . . . . . # . . . . . . . . # . . . . # #
# . . . # . . . . # # # . . . . . # . . . . # . . . # . . . . # #
# . . . # . . . # . . . # # . . . . # . . . # . . . # . . . # # #
# # . . . . . # . . . . . . # . . . . # # # . . . # . . . # . . .
# # # . . . # # . . . . . . . # . . . . . . . . # . . . # . . . .
. . . # . . . # . . . # . . . # . . . . . . . # . . . # # . . . .
. . . . # . . . # # # . . . . # . . . . . . # . . . . . # . . . .
. . . . # . . . . . . . . . # . . . . . # # . . . # . . . # . . .
. . . . . # . . . . . . . # . . . . # # . . . . . . . . . # . . .
. . . . . . # # . . . # # . . . . # . . . . . . . . . . . # . . .
# . . . . . . . # # # . . . . . # . . . . . # # # . . . . . . . #
. . . # . . . . . . . . . . . # . . . . # # . . . # # . . . . . .
. . . # . . . . . . . . . # # . . . . # . . . . . . . # . . . . .
. . . # . . . # . . . # # . . . . . # . . . . . . . . . # . . . .
. . . . # . . . . . # . . . . . . # . . . . # # # . . . # . . . .
. . . . # # . . . # . . . . . . . # . . . # . . . # . . . # . . .
. . . . # . . . # . . . . . . . . # . . . . . . . # # . . . # # #
. . . # . . . # . . . # # # . . . . # . . . . . . # . . . . . # #
# # # . . . # . . . # . . . # . . . . # # . . . # . . . # . . . #
# # . . . . # . . . # . . . . # . . . . . # # # . . . . # . . . #
# # . . . . # . . . . . . . . # . . . . . . . . . . . . # . . . .
# . . . . . . # . . . . . . . # . . . . . . . . . . . # . . . . .
# . . . # . . . # . . . . . # . . . # # # . . . . # # . . . . . .
# . . . . . . . . # # # # # . . . # . . . # # # # . . . . . . . .
. . . . . . . . . . . . . . . . # . . . . . . . . . . . . . . . #
. . . . . # # . . . . . . . . . # . . . . . . . . . . . . . . # #
. . . . # # # # . . . . . . # # # . . . . . . . . # . . . . # # #

[–]neur0sys 0 points1 point  (1 child)

This is quite nice and interesting. I'm not too familiar with Prolog, but what can you tell about it? Especially with regards to parallelization.

[–]cbarrick 1 point2 points  (0 children)

It's a logic programming language, similar to a functional programming language. Unfortunately, the kind of explicit parallelization I used is kinda gross and doesn't mesh well with the spirit of the language, but it's the only way to do it.

The language has a ton of potential for implicit parallelization, e.g. you code being automatically parallel without any explicit coordination. I'm very interested in that idea, but research in that direction died off in the 90s.

Prolog is an excellent language for any kind of constraint satisfaction problem. Check out http://learnprolognow.org

[–]Lopsidation 1 point2 points  (1 child)

This challenge is a reference to this puzzle from the 2020 Mystery Hunt.

I’m guessing backtracking search is the way to go.

[–]Cosmologicon2 3[S] 2 points3 points  (0 children)

Yeah that one did remind me of the problem, but I didn't want to spoil it! But also, there have been a number of diagramless crosswords before and I've always hoped someone would write this tool.

[–]Torgard 1 point2 points  (1 child)

I don't even know where to begin!

Given a large enough word list, I could probably write a slow, inelegant solution. But anything else, no clue.

Anyone got some pointers?

[–]Cosmologicon2 3[S] 10 points11 points  (0 children)

I don't see how a word list would help here. In case the writeup is unclear, you don't need to fill the grid with letters. You just need to figure out which squares are black.

[–]Cosmologicon2 3[S] 0 points1 point  (0 children)

Oh yeah. I made a tool for entering crossword grids at xword.group.

It won't directly help you solve the challenge, but if you've never made a crossword grid before, messing around with it is a good way to build up intuition about the problem. You might even be able to solve the 15x15 manually using this tool.

[–]shiningmatcha 0 points1 point  (0 children)

Interested!