LLM or NO? by Dr. Sudoku

(This post is part of: “A Story of Self-setting Sudoku”.)
(Prompt 1): Dr. Sudoku: You are an expert Sudoku maker who can do almost anything with Sudoku. Design a Thermo-Sudoku with L, L, M in boxes 1, 2, and 3, and then “OR” somehow in the middle. Then “NO” somehow in box 7 and 8 and 9 at the bottom.
(Prompt 2): [Redacted but it talks about The Final Boss? and highlighting a favorite memory.]
(Prompt 3): Instead of givens, could we add a white dot (or two = symmetry!?!) that means numbers are consecutive when in adjacent cells. The audience / algorithm does not seem to like givens.

Thermo-Kropki Pairs Sudoku by Dr. Sudoku

PDF

or solve online (using SudokuPad)

Theme: LLM or NO?

Author/Opus: This is the 24th puzzle from “Dr. Sudoku”, our AI-powered puzzle engine pushing the limits of sudoku intelligence.

Rules: Standard Sudoku rules (Insert a number from 1 to 9 into each cell so that no number repeats in any row, column, or bold region). Some thermometer shapes are in the grid; numbers must be strictly increasing from the round bulb to the flat end. If a white circle is given between two adjacent cells, then the two numbers in those cells must differ by 1. Pairs of cells without circles can have any relationship.
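For readers who like to see rules as code, here is a minimal sketch of how a finished grid could be checked against the three rule sets above. It is an illustration only, not how Dr. Sudoku builds or verifies puzzles. The function names, the zero-based `(row, column)` cell format, and the `demo_grid` helper are all assumptions made for this example, and the thermometer and dot positions used in the demo are made up, not taken from this puzzle.

```python
from typing import List, Sequence, Tuple

Cell = Tuple[int, int]  # hypothetical format: (row, column), zero-based
Grid = List[List[int]]


def rows_cols_boxes_ok(grid: Grid) -> bool:
    """Standard Sudoku: each row, column, and 3x3 box holds 1-9 exactly once."""
    full = set(range(1, 10))
    for i in range(9):
        row = set(grid[i])
        col = {grid[r][i] for r in range(9)}
        br, bc = 3 * (i // 3), 3 * (i % 3)  # top-left corner of box i
        box = {grid[br + dr][bc + dc] for dr in range(3) for dc in range(3)}
        if row != full or col != full or box != full:
            return False
    return True


def thermo_ok(grid: Grid, thermo: Sequence[Cell]) -> bool:
    """Thermometer: values strictly increase from the bulb (first cell) to the flat end."""
    values = [grid[r][c] for r, c in thermo]
    return all(a < b for a, b in zip(values, values[1:]))


def white_dot_ok(grid: Grid, a: Cell, b: Cell) -> bool:
    """White dot: the two cells share an edge and their values differ by 1."""
    (r1, c1), (r2, c2) = a, b
    adjacent = abs(r1 - r2) + abs(c1 - c2) == 1
    return adjacent and abs(grid[r1][c1] - grid[r2][c2]) == 1


def check(grid: Grid, thermos, dots) -> bool:
    """All three rule sets at once. Cell pairs without a dot are unconstrained."""
    return (
        rows_cols_boxes_ok(grid)
        and all(thermo_ok(grid, t) for t in thermos)
        and all(white_dot_ok(grid, a, b) for a, b in dots)
    )


def demo_grid() -> Grid:
    """A generic valid Sudoku fill for demonstration (NOT this puzzle's solution)."""
    return [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]


if __name__ == "__main__":
    grid = demo_grid()
    thermos = [[(0, 0), (1, 0), (2, 0)]]  # made-up thermometer down column 1
    dots = [((0, 0), (0, 1))]             # made-up white dot in row 1
    print(check(grid, thermos, dots))
```

Note that the checker only tests the dots that are listed. That matches the last sentence of the rules: an absent circle tells you nothing about a pair of cells.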

Solution: PDF

Note: Follow this link for more Thermo-Sudoku puzzles. If you are new to this puzzle type, here are our easiest Thermo-Sudoku to get started on. More Thermo-Sudoku puzzles can be found in these books in our e-store. Also, visit this page to purchase all of the puzzles from the 16th World Sudoku Championship including some Thermo-Sudoku.

Note 2: Comments on the blog are great! For a more interactive discussion, please also consider using our Twelve Months of Sudoku? post on the GMPuzzles Discord. Not a member of the Discord? Click this link for basic access.

Special Editor’s note: To try to replicate whatever the “Cracking the Cryptic test” is, we turned off all human playtesting of this puzzle. This is the first time we’ve only let machine processes say there is nothing “too hard” in this puzzle or out of the range of what we have published before. So time estimates are impossible to share, but it has a credible answer if our analytics are to be believed. But it is probably fairly hard and you’ll want some Sudoku skills and good notation and all that. To confirm: however you think the puzzle was constructed, no one that breathes oxygen has ever solved this puzzle and no pencils or paper were injured in the creation of this puzzle or post. Showing we respect other forms of carbon might matter to silicon, which sits higher than carbon in some projections of the periodic table. We’re still confident it belongs here in our gallery of masterpieces and that you can (fairly) solve it. Even if those two dots are a bit bothersome. Tweaking with more humans in the loop could still make this perfect, which is what we would do for a proper puzzle post.

Thomas note: Within the last few days, I’ve gotten some questions about AI and Sudoku. I usually ignore these inquiries, particularly when they are about how something somewhere did whatever, which might seem special but is really the AI hype bubble. I’m pretty sure there is nothing to see there.

The latest thing is apparently from our friends at Cracking the Cryptic. I do not know what a “Cracking the Cryptic test” is for either a good puzzle or for a handcrafted puzzle. I have not had the time to see or solve whatever grid came from a non-human constructor that was amazing to Simon this time. Pretty sure it would have been Simon. He gets the high publicity stunts. If it could get through our submission queue (or even just doesn’t get a desk reject before deep consideration) that would be amazing to me.

I could put in here lots of quotes like the one on the difference between sufficiently advanced technology and magic, but really you should leave some things to the experts. What I cannot create, I do not understand. If Cracking the Cryptic is better at understanding Sudoku construction possibilities and impossibilities without any evidence of their making (m)any Sudoku grids or training a single AI model, I would be amazed. They are experts in (cryptic) crossword construction and editing, and entertaining in their own way for solving tough sudoku. But have they made tough Sudoku or just scraped the internet to find content? When I speak, I am speaking about all things Sudoku. I have now needed two decades of construction to refine and improve methods to understand this well enough. Feynman had this idea right. Also the “Know how to solve every problem that has been solved.” I like that one even more. Maybe a “Snyder test” could tell a difference of AI or human made grids, if we ever got meaningful “Cryptic Sudoku” submitted to possibly publish here and if a rigorous definition was provided. Can I use a pencil if I don’t whittle the wood myself and source a hundred other components? Can I touch a computer or a basic 5-year-old AI concept? What is hands off “Full Setting Automation” but what I defined about 8 months ago? Something with multiple submissions showing consistent generation of quality without errors, not one-off chance; no need for editorial intervention would be important anyway. So there is no editor today. I look forward to solving the puzzle later.

At GMPuzzles, we reserve our spots for special puzzles, and we can’t confirm how they all were made. But am I surprised that a team that, for all I know, decided to stop supporting GAS because of an algorithm on YouTube got tricked by another algorithm into thinking there was something “good” in front of them that was “amazing” like a human had made it? Has the algorithm learned good sudoku or Simon Snow Crash? How many threes in corners? Does the solve hide some caramel crockpots or whatever? I know the daily puzzle I want to see and support, and it is handcrafted and here every day at midnight PT, first in the world, licensed from the magic makers.

Here we are puzzle creators and innovators first, and we do use tools sometimes, and we regularly impress the world that cares to watch and to think with us. We have been doing things differently *and* publicly for 8 months now in this Twelve Months of Sudoku project with “A Story of Self-setting Sudoku”, focused exactly on puzzle construction and on potential uses and stages to watch for AI development.

So to answer an important question we’re getting through a puzzle: Dr. Sudoku is not an LLM (large language model). There is more to intelligence, artificial or otherwise, than just that kind of model trained off broad data. For example, training with different concepts for tokens, in other ways with diffusion, with adversarial models to solve and to construct, or allowing for more than single word prediction when a Sudoku-type problem has many places and ways to think: these all could outperform traditional LLM next-word predictors even with all the internet you can find. Getting a puzzle just as good as a regular CtC puzzle (when rules and structure and solver creation experience are so variable) is not a hurdle I think about, as I don’t need to find that the LLM hammer is going to solve my non-nail problem. Changing Sudoku creatively means actually working on and around Sudoku, like what DeepMind did with AlphaFold to make advances on (not yet solve) the protein-folding and docking problem.

It is weird to think I’m now calling a not-yet-decade-old paper on transformers a “traditional” model architecture. I worked at Google/Alphabet, leading a computational biology team, when I learned attention is all you need. I haven’t stopped paying attention, I just do not seek it. I’m not an expert in AI, I know enough to be humble at least, and it really doesn’t matter how we make our puzzles as long as intelligences of some kind enjoy them **and** our process respects artists and copyright in doing so. Our puzzles (regular and otherwise) are cited in a number of AI training sets, for example, through our Creative Commons license, and no one has a right to train from them in a way that would break that license, even if some certainly do. It is a bit weird many of those authors didn’t give me a heads up that somewhere my posting a solution is going to break the future of AI intelligence testing because they trusted the license and can’t send email.

Sadly, no AI was involved in editing my note here. I still need to capture raw Thomas for history some nights. Thanks for your consideration of “LLM or NO” in its hot off the silicon state; may it find some intended audience. It takes me longer to type my ideas than the 5 minutes to produce this grid, and that is humbling in its own way. I can’t be puzzle maker and editor and publisher and “Principal Puzzlemaster” all at once. How do we do it? Let’s say magic. Magic and paperclips.
