I’ve been working on #pants for 4 months now – on and off, due to an intermittent client- and/or baby-related workload. Still, I can’t believe it’s only been 4 months. Even within this short time frame, the project has gone through so many different iterations. They grow up so fast!
When I started out, I kept saying how I didn’t really know what #pants was and where it was headed. Four months later, I think I have a pretty good idea of what it is, so I thought it’d be fun (and potentially useful) to look back at its development so far and think about what worked (and what didn’t), and which parts of #pants I would do differently if I were to start from scratch. (Which I’m not going to; but maybe there’s a chance to get the most important or interesting bits implemented in the current code base.) Well, here we go.
Open Source from the start: yes, #pants has been open-source from its very first commit, something I’m very happy about. This is a lesson that I learned when working on sloblog.io; when I decided I wanted to open-source the project months after launching it, I realized that I didn’t really want to bother with cleaning up the code (having made the usual mistakes of committing secrets to the code base that should not be public, and other atrocities.) I didn’t want to make the same mistake with #pants.
I need to admit that I have a pretty ambivalent attitude towards open-sourcing my projects. In my experience, most people asking for something to be open-sourced are really asking for it to be provided for free, which is fine, and often to be easily installable on their (sometimes underpowered) servers, which is also fine to ask for, but very hard to achieve with complex software. Giving in to these requests in spite of reservations is a recipe for disappointment, which is one of the many reasons why, in #pants’ README, I’ve been paradoxically telling people not to install it.
And this is where I’m at with open-sourcing #pants right now. I do want to share the code, allowing others to learn, review, comment and possibly hack; obviously, I also do want others to be able to install the code on their own servers, duh; but I do not want to (and can’t) offer much in the way of installation support at this point.
Having said that, rest assured that encouraging people to install their own copy of #pants is a cornerstone of its future; however, the only officially supported deployment will be Docker-based, and I simply haven’t had the time to prepare decent Docker images.
Rails doesn’t have a great concurrency story; its server processes use humongous amounts of memory; the way gems (Ruby library packages) work is an invitation for unchecked bloat. So why did I pick it for #pants? Even with all these caveats, Rails is the framework that I’ve been using almost exclusively for most of the last decade; with the familiarity gained in this time plus all the amazing Rails gems available, I can whip up something over the course of a weekend that would probably take (me) several times longer with any other language or framework. #pants doesn’t have a high number of users yet, and none of them appear to be facing load issues, so I think we should be okay for a while. If #pants gets its 15 minutes and isn’t able to handle them, it’s going to be a fun problem to solve through a reimplementation in another language.
(Ironically, the best fit when it comes to improving #pants’ installation story would have been share-nothing style PHP – just upload the code to your web server and you’re good to go, I guess? But let’s just say that I’m not a fan.)
PostgreSQL vs. Document Databases: #pants uses PostgreSQL as its database, an RDBMS that I’ve recently fallen in love with after years of using MySQL. Seriously, if your app needs a relational database, give PostgreSQL a try. It is great; the extra features it offers (like arrays) allow you to model certain things much more quickly and with less code.
However, four months into the project, I now wonder if a document database (aka NoSQL, etc.) would have been a better fit. From where I’m standing, the lack of a strict schema alone is a huge advantage in fast-evolving open-source projects; migrating relational database tables and integrating these migrations in deployment runs isn’t exactly rocket science in Rails, but explaining to users that they need to run update scripts (even if it’s just a rake db:migrate) can be. What I really want users to be able to do is to just update/replace the code (e.g. the Docker image) and have things just work. This is probably one of the most trivial differences between relational and document databases, but here it would make a big difference.
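To illustrate why a missing schema can remove the update step, here is a minimal sketch in plain Ruby. It is not #pants code: the `POST_DEFAULTS` constant, the `load_post` helper and the field names are all made up for this example. The idea is that newer code fills in fields that older documents lack at read time, so nothing has to be migrated.

```ruby
# Hypothetical defaults for fields that were added in later versions of the code.
POST_DEFAULTS = {
  "type" => "text",
  "tags" => [].freeze
}.freeze

# Merge a stored document over the defaults. Values present in the document
# win; fields the document predates fall back to the default.
def load_post(document)
  POST_DEFAULTS.merge(document)
end

old_post = load_post("body" => "Hello world")                    # written before "tags" existed
new_post = load_post("body" => "Hi again", "tags" => ["pants"])  # written by current code
```

The trade-off is that the defaults now live in application code instead of the database, so every reader of the data has to apply them consistently.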
In addition to this, I’ve been lustingly ogling RethinkDB for some time now. I haven’t actively used it in any projects so far, but what these guys are doing just feels and looks right, and I so want to give it a serious spin.
Another option may be CouchDB, which has some views on simplicity that strongly resonate with me, even though I’ve never used it in a real project so far, either.
Binary asset handling: Up to this point, #pants doesn’t really allow you to upload binary assets (post images etc.) beyond your user avatar. I was going to leave the storing of images et al to the users, but I realize now that #pants absolutely needs to support this stuff out of the box; a large part of social networking (and blogging in general) revolves around the sharing of images and other assets, and this stuff should be on the user’s server, so in the long run I can’t really get away with making users upload their stuff to Dropbox et al as I’m doing now.
The problem is that dealing with file uploads has always been and still is tricky, especially with open-source software that you’re expecting people to eventually install on their own servers. I’m using the excellent Dragonfly gem to deal with uploaded images and the like, and it’s easy to configure it to anything from using the local filesystem to putting everything on Amazon S3, but where do you strike the balance between convenience, flexibility and minimal external dependencies? The easiest configuration is to just use the local filesystem, but then you’ll be in trouble once you need or want to add more web servers; also, a bunch of potential hosting options (like Heroku) would not work. Amazon S3? It’s cheap, easy to set up and generally great, but it’s yet another external dependency that you need to set up (and sign up for); also, not everybody is keen on Amazon.
Ideally, I would like to put uploaded files in the same store that is holding all the other data: the database. Storing binary data in PostgreSQL isn’t difficult, but it isn’t trivial, either. Its Large Objects aren’t natively supported by ActiveRecord, and I wasn’t entirely confident about the workaround-ish Dragonfly plugin that I built at some point; simply serializing files and splitting them up into chunks (by way of dragonfly-activerecord) may be more reliable, but it is extremely smelly in more ways than one.
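For readers unfamiliar with the chunking approach, the following is a rough sketch of the general idea in plain Ruby. It does not reproduce how dragonfly-activerecord actually stores data; the helper names and the tiny chunk size are assumptions chosen for illustration. Each chunk would typically end up as one row in a table, ordered by an index.

```ruby
# Split a binary string into fixed-size pieces (the last one may be shorter).
# `chunk_size` is in bytes; the value used below is deliberately tiny.
def split_into_chunks(data, chunk_size)
  binary = data.b # work on a binary (ASCII-8BIT) copy so slicing is byte-exact
  (0...binary.bytesize).step(chunk_size).map do |offset|
    binary.byteslice(offset, chunk_size)
  end
end

# Reassemble the original bytes from chunks that are already in order.
def join_chunks(chunks)
  chunks.join.b
end

file_data = "\x00\xFFhello world".b
chunks    = split_into_chunks(file_data, 4)
restored  = join_chunks(chunks)
```

Even in this toy form the smell is visible: reading a file back means loading and concatenating every chunk in the right order, and nothing stops a partial write from leaving an incomplete file behind unless the chunks are written in a transaction.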
Yes, you could probably hook up Dragonfly to some FTP server easily, but once again, the perfect option would simply work out of the box, using the same database that’s storing everything else. This may be another point in favor of a document database; MongoDB and CouchDB are said to have very solid binary data support (with Dragonfly plugins readily available), and RethinkDB appears to have at least some support, too.
GUIDs, URLs, slugs, oh my: this is really just a technical detail, but a tricky one. In an ideal world, a #pants post would simply have a single, globally unique ID: its URL. However, since post IDs are not supposed to change, this would completely obliterate all SEO efforts, as you could never change an existing post’s URL to something that’s more SEO-friendly. Even worse, someone switching their #pants instance to HTTPS would basically change all URLs of all their posts. Ouch.
So I ended up separating URLs from my own flavor of GUIDs, the latter being simply a URL, but without a protocol (e.g. a post at the URL http://hmans.io/foo123 would have a GUID of hmans.io/foo123). The idea here is that the URL is allowed to change (both its protocol and path), while the GUID should always stay the same (and, if necessary, redirect to the post’s current URL).
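The URL-to-GUID conversion described here could look roughly like the sketch below. `guid_for` is a hypothetical helper written for this example, not the actual #pants implementation, and it assumes that a GUID is exactly the host plus the path (ports, query strings and fragments are ignored).

```ruby
require "uri"

# Hypothetical helper: derive a protocol-less GUID from a post's URL
# at the time the post is created.
def guid_for(url)
  uri = URI.parse(url)
  "#{uri.host}#{uri.path}"
end

guid_http  = guid_for("http://hmans.io/foo123")
guid_https = guid_for("https://hmans.io/foo123")
```

Both calls produce the same GUID, which is the point: switching the site to HTTPS changes the URL but not the identifier. A later change of the path would not be reflected, so the GUID has to be stored once rather than recomputed from the current URL.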
If this confuses you, I don’t blame you. I’ve never been too happy with this setup; it feels like an overly convoluted solution for a problem that we haven’t really faced yet. If I were to start over (or find the time to refactor most of #pants’ code), I would probably just keep two different URLs: one being the ID (staying the same over the lifetime of a post), the other being the human-facing, SEO-friendly URL. This would allow me to get rid of a heap of code converting URLs to GUIDs and back.
Indieweb: #pants is what the indieweb community would call “an indieweb implementation”, but I didn’t quite realize this until several weeks into the project. #pants is now making heavy use of Indieweb protocols and markup, but it didn’t do so from the beginning, and it absolutely should have. I’m not sure how this happened; I’ve been aware of indieweb for a while now, but didn’t think of embracing it with #pants when I started development.
Webmentions everywhere: Webmention is an extremely simple reboot of protocols like Trackback or Pingback and probably the most important thing to come out of Indieweb for #pants. Webmentions allow site A to tell site B that it has published something interesting, like a comment on a post. #pants uses Webmention liberally; when you post a reply to someone else’s post, their site receives a Webmention, allowing them to pull your comment and display it beneath their post; when you follow another #pants site, it is notified through a Webmention; the upcoming Like feature uses it, too.
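To show how little is involved, here is a sketch of the notification itself using only Ruby’s standard library. A Webmention is an HTTP POST with two form-encoded parameters, source (the page that contains the mention) and target (the page being mentioned). The helper name and all URLs below are placeholders, and the sketch assumes the receiver’s Webmention endpoint has already been discovered; it is not taken from the #pants code base.

```ruby
require "net/http"
require "uri"

# Build (but do not send) the POST request telling `endpoint` that the page
# at `source` links to the page at `target`.
def build_webmention_request(endpoint, source, target)
  request = Net::HTTP::Post.new(URI.parse(endpoint))
  # Sets the body and the application/x-www-form-urlencoded content type.
  request.set_form_data("source" => source, "target" => target)
  request
end

request = build_webmention_request(
  "https://site-b.example/webmention",    # assumed endpoint of the receiving site
  "https://site-a.example/my-reply",      # the reply that was just published
  "https://site-b.example/original-post"  # the post being replied to
)

# Sending it would look like this (left commented out so the sketch has no
# network side effects):
# uri = request.uri
# Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == "https") do |http|
#   http.request(request)
# end
```

The receiving site is then expected to fetch the source URL itself and check that it really links to the target, which is why the request carries nothing but the two URLs.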
The great thing about this approach is that it allows #pants to easily communicate with non-#pants sites, as long as they also use Webmention. Once your #pants site receives a Webmention, it will pull and analyze the referenced source URL, no matter if it’s a #pants site, WordPress, a static HTML page or whatnot, and depending on what it finds there, list the reply/fav/etc. underneath the referenced post. Just like with my previous point about Indieweb, I only realized just how great this is very late in the project, and I would have saved a lot of time by embracing it from the start. (Incidentally, #pants was using a very similar setup to allow for remote #pants instances to communicate; I called it a “Ping”, and while I eventually refactored the whole business to simply use Webmention, there’s still a Ping model and a
Separate post types, pre-rendered body HTML: this is one of #pants’ features (or rather design decisions) that I’m especially happy about. Basically, the stuff that we’ve been using so far – text posts that contain Markdown which is then rendered to HTML – is just one of many (possible) types of post objects that can be published on your #pants site. The upcoming Like feature uses a second post type, and I want to make it trivially easy for people hacking #pants’ source code to add new types (think image gallery posts, reviews, polls, etc.) The great thing is that custom #pants post types will even work on remote #pants servers that don’t have the code powering them, as each post comes with a pre-rendered chunk of HTML that will just be displayed instead. The one thing that I would like to add to #pants is to have not one, but two pieces of pre-rendered HTML; one for the full post, one for the feed views, possibly containing a shorter version of the post.
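The pre-rendering idea can be sketched in a few lines of plain Ruby. Everything here is hypothetical (the `Post` struct, the `RENDERERS` registry, the `publish` and `display` helpers and the post types themselves) and only meant to show the division of labor: the publishing site renders a post once using type-specific code, while a receiving site displays the stored HTML without knowing anything about the type.

```ruby
require "erb"

# A post carries its type, its raw data and the HTML rendered at publish time.
Post = Struct.new(:type, :data, :body_html)

# Renderers known to *this* installation, keyed by post type.
RENDERERS = {
  "text" => ->(data) { "<p>#{ERB::Util.html_escape(data.fetch(:text))}</p>" },
  "like" => ->(data) { %(<p>Liked <a href="#{ERB::Util.html_escape(data.fetch(:url))}">a post</a></p>) }
}

# Publishing side: needs a renderer for the type and stores its output.
def publish(type, data)
  Post.new(type, data, RENDERERS.fetch(type).call(data))
end

# Receiving side: no renderer lookup at all, just show the stored HTML.
def display(post)
  post.body_html
end

local_post  = publish("text", { text: "1 < 2" })
# A post from a remote server whose "poll" type this installation has no code for.
remote_post = Post.new("poll", { question: "Pants?" }, "<p>Poll: Pants?</p>")
```

Note that a real implementation would have to sanitize HTML received from other servers before displaying it; the sketch skips that entirely.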
That’s it for now, I think. I apologize for spamming your network timeline with such a long post! Let me finish this with saying how happy I am that so many people are enjoying #pants already. And remember, if you want to give it a try, here’s how!