Migrating different message types from Slack->Discourse

So I’m currently focusing on the migration of Slack content to Discourse and have a few questions.

Before those, though, an invitation to slap me silly and tell me to stop making things overly complicated if and when that makes sense. It’s clear that there are lots of details in Slack, and we don’t want/need all of them to migrate across to Discourse. I tend to get a little obsessive about that kind of thing, though, and could easily find myself spending hours chasing some detail that actually no one cares about.

With that said, all our messages on Slack have type “message”, but there are quite a few subtypes:

["bot_message", "channel_archive", "channel_join", "channel_leave",
 "channel_name", "channel_purpose", "file_comment", "file_mention",
 "file_share", "pinned_item"]

There is no subtype for the case that someone just posts a comment somewhere in a channel.

So going through these:

  • bot_message: These are all Github commits and the like. I assume we have no need to carry these forward to Discourse?
  • channel_*: These are mostly administrative notifications and I think we probably want to ignore many of them. (I doubt we need a record in Discourse of when people joined/left channels and the like.) I presumably do want to peel out the channel name and purpose info to include when I create the relevant topic, however.
  • file_*: These are when people attach a post, file, or snippet, or comment on or refer to said thing. We definitely want these since that’s actually where a lot of the content is. I have several questions about these; see below.
  • pinned_item: This just indicates that an item was pinned, and we didn’t use that a lot in Slack. We could turn pinned Slack items into pinned Discourse items, or we could decide we don’t care. Thoughts on that?
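As a rough sketch of the triage above (not the actual migration script), the subtype handling could be written as a lookup table. It assumes each exported message is a dict parsed from the Slack export JSON, with an optional "subtype" key; the names ACTIONS, action_for, and partition are made up for illustration, and skipping pinned_item is only a placeholder since that question is still open.

```python
from collections import defaultdict

# Planned handling per subtype, following the list above.
# None is the key for plain messages, which have no "subtype" field.
ACTIONS = {
    None: "post",
    "bot_message": "skip",
    "channel_archive": "skip",
    "channel_join": "skip",
    "channel_leave": "skip",
    "channel_name": "topic_metadata",
    "channel_purpose": "topic_metadata",
    "file_comment": "post",
    "file_mention": "post",
    "file_share": "post",
    "pinned_item": "skip",  # assumption: undecided, skipped for now
}


def action_for(message):
    """Return the planned action for one exported Slack message (a dict)."""
    # Unknown subtypes are flagged for manual review rather than dropped.
    return ACTIONS.get(message.get("subtype"), "review")


def partition(messages):
    """Group messages by planned action, preserving their original order."""
    buckets = defaultdict(list)
    for message in messages:
        buckets[action_for(message)].append(message)
    return dict(buckets)
```

Anything with a subtype not in the table ends up under "review", so a new or unexpected subtype shows up instead of silently disappearing.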

Returning to the file_* events, in Slack it was possible to share a post, for example, in multiple channels, which would make comments on that post visible as messages in all of those channels. I don’t think we ever did that, so I think there’s actually a one-to-one relationship between posts and channels. Assuming that’s (very nearly) true, my inclination is to try to “collapse” the space of channels and the space of posts, just turning all the file_* events into “regular” posts in the relevant Discourse topic. We could generate a separate sub-category topic for posts, and then have each Slack post be a topic in that sub-category, but that smells like unnecessary fuss to me. I’m totally open to alternative opinions, however.

Arguably the Really Big Question is what to do with the actual posts/attachments/snippets. It turns out that the Discourse gem (and maybe the API, that’s less clear) doesn’t appear to support adding attachments to posts, which means I don’t have an easy way to automate that. Given that, a couple of options appear plausible (and there are others, but these seem to provide good combinations of effort-to-reward):

  • Just link back to the post on Slack at the top of the topic. This is super easy, but presumably not highly desirable given that the point is to escape Slack altogether.
  • Set up some place where we can “archive” those documents, and then have the links point to those archived copies.

The latter is probably the “nicest” option, but may not be trivial to make happen depending on the implementation details. The “right” way to do it would presumably be to have that set of files somewhere on the push-language.hampshire.edu box, but that requires at least:

  • There be a directory for those files.
  • There be a web server that can serve up those files.
  • I/my script (temporarily) has permission to write to that directory (which is definitely a potential security concern depending on how it’s handled).


Alternatively we could put the files in any of a number of file sharing things if we trust them to be around long enough. I could, for example, archive them up to a Google Drive folder and make @lspector and maybe a few others owners of it so it’s not overly tied to me. Similar things could be done with Dropbox. Or we could create a Github repo whose only purpose would be to hold those archived files, and then link to those.


Discourse supports uploads

(Including to S3)

Anticipation of a small set of these complications was the reason that I previously advocated just dumping everything into whatever’s easiest to dump into, which we can move to Discourse, and which will allow us to find stuff by searching on Discourse, even if things are ugly and relatively hard to navigate once we find them.

I’d love to be able to find our slack posts in searches, but I think it’d be a mistake to put too much time into making this nice or even complete.

Overall, I’d say that the answer to almost all of your questions should be “do whatever is easiest”.

I’d ignore them.

Do posts that were just text entered into Slack appear as simple messages that you can pipe into Discourse? Or are only the comments that way? If the posts are, then I’d say just add them in with everything else.

Regarding attachments, I think the main things we attached were some CSVs and some figures – pretty minor stuff in general. I’d be fine just linking to the original files in Slack, knowing that if Slack goes down or whatever that those attachments are low-priority enough that we don’t mind losing them. If this is the easiest thing to do, that’s what I’d do!

Unfortunately only comments come in this way. Posts, attachments, and snippets are all handled differently, much to my current annoyance :grimacing:

It turns out that there are only 37 attachments/posts/things:

  • 5 PDFs
  • 2 PNGs
  • 1 AVI
  • 1 CSV

The remaining 28 are text files, which includes posts and snippets, both of which are saved as ASCII text files. That’s fewer than I had realized, and a higher percentage of text, which we can just inject into Discourse directly. So I think I’ll just try to inject the text files right into Discourse, and then attach the 9 other files by hand rather than continue to fiddle with this.
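For what it’s worth, here is a minimal sketch of how that split could be automated, assuming the exported files have all been downloaded into one local directory. The function names are invented for illustration, and "text" here just means "decodes as ASCII with no NUL bytes", so a plain CSV would land in the inject pile too.

```python
from pathlib import Path


def is_ascii_text(data: bytes) -> bool:
    """True if the bytes decode as ASCII and contain no NUL bytes."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return b"\x00" not in data


def split_files(directory):
    """Return (inject, by_hand): names of text files and of everything else."""
    inject, by_hand = [], []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        if is_ascii_text(path.read_bytes()):
            inject.append(path.name)
        else:
            by_hand.append(path.name)
    return inject, by_hand
```

The by_hand list is then the short set of files to upload through the Discourse UI manually.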

BTW, this gets increasingly non-general, which is kind of a bummer, but it’s clear that a general purpose translation/migration tool is way beyond the scope of this project given my to-do list.