Yeah, so I’m stupidly late to this game, but…
I think this points to a very important and tricky part of all this that I was thinking about when talking with @lspector about the GECCO tutorial.
There’s Push as a semi-abstract concept: typed stacks, linear sequences of instructions, instructions reverting to NOOPs when they don’t work, timeouts or some such to catch infinite loops, answers taken from the “appropriate” stack. There’s PushGP as a semi-abstract concept: evolve linear sequences of instructions, possibly with a genome if we need to manage things like blocks or want to support things like gene silencing.
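To make the “semi-abstract” Push idea concrete, here is a toy sketch in Python. It is not how Clojush, Klaupacius, or Pysh actually do it: the function `run_push`, the `step_limit` parameter, and the two instructions are names I made up for illustration, and there is no exec stack, so the step limit only caps how many tokens get processed.

```python
from collections import defaultdict

def run_push(program, step_limit=100):
    """Run a flat list of literals and instruction names on typed stacks."""
    stacks = defaultdict(list)  # one stack per type: "integer", "boolean", ...
    steps = 0
    for token in program:
        if steps >= step_limit:  # crude stand-in for a timeout (assumption)
            break
        steps += 1
        if isinstance(token, bool):  # check bool first: bool is a subclass of int
            stacks["boolean"].append(token)
        elif isinstance(token, int):
            stacks["integer"].append(token)
        elif token in INSTRUCTIONS:
            INSTRUCTIONS[token](stacks)
        # anything else is ignored in this sketch
    return stacks

def integer_add(stacks):
    ints = stacks["integer"]
    if len(ints) < 2:  # not enough arguments: behave as a NOOP
        return
    b, a = ints.pop(), ints.pop()
    ints.append(a + b)

def integer_lt(stacks):
    ints = stacks["integer"]
    if len(ints) < 2:  # not enough arguments: behave as a NOOP
        return
    b, a = ints.pop(), ints.pop()
    stacks["boolean"].append(a < b)

# Hypothetical two-instruction set, just for this example.
INSTRUCTIONS = {"integer_add": integer_add, "integer_lt": integer_lt}

# The second integer_add finds only one integer on the stack and does nothing.
stacks = run_push([1, 2, "integer_add", "integer_add", 5, "integer_lt"])
print(stacks["boolean"])  # [True]
print(stacks["integer"])  # []
```

If the problem wants a boolean answer, you’d read it off the top of the boolean stack; that’s the “appropriate” stack part.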
And then there’s the reality of Clojush, which is pretty much every random thought @lspector’s had in the last decade, or Klaupacius, which is pretty much every random thought @bill’s had in the last decade, but with tests. (And now there’s Pysh…you can fill in the blanks.)
If part of the goal of Push Redux is to bring new people into the community, then I think we need to be clear about the difference between those things, and help new visitors be clear about them as well. It’s easy to conflate Push and Clojush, and we tend to do that a lot without really thinking much about it, and one of the nice things about tools like Klaupacius and Pysh (and Ellen, etc.) is they help highlight those assumptions.
So I think we should be very explicit on something like Push Redux about what we think is really important about the ideas of Push, and what’s more “stuff that we implemented once and may or may not use much”.
Coming back to the questions of instructions, my first thought was that “Clojush instructions should be documented on the Clojush site, and Klaupacius instructions should be documented on the Klaupacius site, etc.” so we separate the big picture concepts and the implementation details.
My second thought was that maybe we should list at least the (Clojush) instructions that we have evidence matter. We could go through the zillion results files from @thelmuth’s dissertation (and there are probably similar files for @williamlacava) and pull out all instructions used in the simplified programs for winning individuals. We could then do frequency counts on those, and then highlight on Push Redux the instructions that seemed to come up fairly often in solutions across large numbers of problems. This would not be to suggest that those are the only instructions you need, or that other instructions in the current systems might not be useful (esp. for certain problems), or that there might be instructions we’ve never imagined (or implemented outside of Klaupacius, which has everything). But it would mean that every instruction we bother saying something about is an instruction that is demonstrably valuable.
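The counting step could look something like the sketch below. Everything about the input is an assumption: I’m pretending we’ve already parsed the results files into `(problem_name, instruction_list)` pairs, and the problem names and sample data are made up. `instruction_counts` is a hypothetical helper, not something that exists in Clojush.

```python
from collections import Counter

def instruction_counts(solutions):
    """Count instruction use across simplified winning programs.

    solutions: iterable of (problem_name, list_of_instruction_names).
    Returns (total_uses, problem_counts), where problem_counts says in how
    many distinct problems each instruction showed up at least once.
    """
    total_uses = Counter()
    problems_seen = {}
    for problem, program in solutions:
        total_uses.update(program)
        for instr in set(program):
            problems_seen.setdefault(instr, set()).add(problem)
    problem_counts = Counter(
        {instr: len(probs) for instr, probs in problems_seen.items()}
    )
    return total_uses, problem_counts

# Made-up sample data, only to show the shape of the input.
sample = [
    ("problem-a", ["integer_add", "exec_if", "integer_add"]),
    ("problem-b", ["integer_add", "boolean_not"]),
    ("problem-b", ["exec_if"]),
]
total_uses, problem_counts = instruction_counts(sample)
print(total_uses["integer_add"])      # 3
print(problem_counts["integer_add"])  # 2
```

Counting distinct problems separately from raw uses matters here, since an instruction that appears a hundred times in solutions to one problem is weaker evidence than one that appears once in solutions to many problems.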
That would require some work, but it would be useful work, and might ultimately turn into a paper or a note. (Maybe a student project and a student workshop paper?) I could imagine that a fancy job of this might introduce some useful clustering on instructions that tend to appear together, and then we could use something like SMAC to try to “tune” clusters of instructions to use on a problem.
Looking at it, though, there seem to still be several blanks in and amongst all the excellent work that’s been done there. Would it be useful to have a TODOs page that lists the known gaps, so someone looking to contribute a little could look there? Or that could be handled using GitHub issues if people think that would be preferable.
There might be a reason to keep this, but I would agree that we want to remove it if it’s not likely to be used. The reason I’m thinking of is that most of Discourse is hidden from the public, and there are times when I think we discover something that might be of interest to a broader community but doesn’t necessarily rise to the level of a paper. If there was a blog here, though, then we could use that to share these “notes” (perhaps after working out the basics here on Discourse). For example, I could see a blog post about the fact that adding the new-ish dup_… instructions changed performance on the software synthesis problems, since that’s not likely to rise to the level of a paper but might be interesting to others.