## Coleslaw: A Hacker's Guide

Here we'll provide an overview of key concepts and technical decisions
in *coleslaw* and a few suggestions about future directions. Please
keep in mind that *coleslaw* was written on a lark when 3 friends had
the idea to each complete their half-dreamed WordPress replacement in
a week. Though it has evolved considerably since its inception, like
any software, some mess remains.

## Core Concepts

### Data and Deployment

**Coleslaw** is pretty fundamentally tied to the idea of git as both a
backing data store and a deployment method (via `git push`). The
consequence is that you need a bare repo somewhere with a post-receive
hook. That post-receive hook
([example](https://github.com/redline6561/coleslaw/blob/master/examples/example.post-receive))
will check out the repo to a **$TMPDIR** and call `(coleslaw:main $TMPDIR)`.

It is then coleslaw's job to load all of your content, your config and
templates, and render the content to disk. Deployment is done by
moving the files to a location specified in the config and updating a
symlink. It is assumed a web server is set up to serve from that
symlink. However, there are plugins for deploying to Heroku, S3, and
GitHub Pages.

### Blogs vs Sites

**Coleslaw** is blogware. When I designed it, I only cared that it
could replace my server's WordPress install. As a result, the code
until very recently was structured in terms of POSTs and
INDEXes. Roughly speaking, a POST is a blog entry and an INDEX is a
collection of POSTs or other content. An INDEX really only serves to
group a set of content objects on a page; it isn't content itself.

This isn't ideal if you're looking for a full-on static site
generator. Content Types were added in 0.8 as a step towards making
*coleslaw* suitable for more use cases, but they still have some
limitations. Any subclass of CONTENT that implements the *document
protocol* counts as a content type. However, only POSTs are currently
included on INDEXes since there isn't yet a formal relationship to
determine what content types should be included on which indexes.

### The Document Protocol

The *document protocol* was born during a giant refactoring in 0.9.3.
Any object that will be rendered to HTML should adhere to the protocol.
Subclasses of CONTENT (content types) that implement the protocol will
be seamlessly picked up by *coleslaw* and included on the rendered site.

All current Content Types and Indexes implement the protocol faithfully.
It consists of 2 "class" methods, 2 instance methods, and an invariant.

**Class Methods**:

Since Common Lisp doesn't have explicit support for class methods, we
implement them by eql-specializing on the class, e.g.
```lisp
(defmethod foo ((doc-type (eql (find-class 'bar))))
  ... )
```

- `discover`: Create instances for documents of the class and put them in
  the in-memory database with `add-document`. If your class is a subclass of
  CONTENT, there is a default method for this.
- `publish`: Iterate over all objects of the class, rendering each one and
  writing the result to disk.

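To make the eql-specializing idiom concrete, here is a minimal, self-contained sketch. It does not use coleslaw's code: the `note` class, the `*documents*` table, and this toy `add-document` are hypothetical stand-ins, and `publish` returns strings instead of writing files.

```lisp
;; Hypothetical stand-ins throughout; none of this is coleslaw's own code.
(defvar *documents* (make-hash-table :test #'equal)
  "Toy in-memory database keyed by slug.")

(defclass note ()
  ((slug :initarg :slug :reader note-slug)
   (text :initarg :text :reader note-text)))

(defun add-document (doc)
  "Store DOC in the toy database under its slug."
  (setf (gethash (note-slug doc) *documents*) doc))

(defgeneric discover (doc-type)
  (:documentation "Create instances of DOC-TYPE and store them."))

(defgeneric publish (doc-type)
  (:documentation "Process every stored instance of DOC-TYPE."))

;; "Class methods": specialize on the class object itself, not on an instance.
(defmethod discover ((doc-type (eql (find-class 'note))))
  (dolist (spec '(("hello" . "Hello, world") ("bye" . "Goodbye")))
    (add-document
     (make-instance doc-type :slug (car spec) :text (cdr spec)))))

(defmethod publish ((doc-type (eql (find-class 'note))))
  ;; A real implementation would render and write files; this one
  ;; only collects one string per stored instance.
  (loop for doc being the hash-values of *documents*
        when (typep doc doc-type)
          collect (format nil "~a: ~a" (note-slug doc) (note-text doc))))
```

Calling `(discover (find-class 'note))` fills the table, and `(publish (find-class 'note))` then visits each stored instance. Both calls dispatch on the class object, which is what lets them behave like class methods.
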
**Instance Methods**:

- `page-url`: Generate a unique, relative path for the object on the site
  sans file extension. An :around method adds that later. The `slug` slot
  on the object is generally used to hold a portion of the unique
  identifier, e.g. `(format nil "posts/~a" (content-slug object))`.
- `render`: A method that calls the appropriate template with `theme-fn`,
  passing it any needed arguments and returning rendered HTML.

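The split between a primary `page-url` method and an :around method can be sketched as follows. This is an illustration under assumptions, not coleslaw's implementation: the `entry` class is hypothetical, and the hard-coded ".html" extension is a guess made for the example.

```lisp
;; Hypothetical sketch; the class and the ".html" extension are assumptions.
(defclass entry ()
  ((slug :initarg :slug :reader entry-slug)))

(defgeneric page-url (document)
  (:documentation "Return the relative path of DOCUMENT on the site."))

;; Primary method: a unique relative path without a file extension.
(defmethod page-url ((object entry))
  (format nil "posts/~a" (entry-slug object)))

;; :around method for every document: it wraps whichever primary method
;; applies and appends the extension to its result.
(defmethod page-url :around (document)
  (format nil "~a.html" (call-next-method)))
```

With these definitions, `(page-url (make-instance 'entry :slug "hello-world"))` returns `"posts/hello-world.html"`. Because the :around method is not specialized on `entry`, a new document class only has to supply the primary method.
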
**Invariants**:

- Any Content Types (subclasses of CONTENT) are expected to be stored in
  the site's git repo with the lowercased class-name as a file extension,
  e.g. ".post" for POST files.

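One way to read the invariant is that the file extension can be derived from the class alone. The sketch below shows that derivation with an empty stand-in `post` class; the helper names `class-extension` and `files-for-class` are hypothetical and are not part of coleslaw.

```lisp
;; Hypothetical helpers; POST is an empty stand-in class.
(defclass post () ())

(defun class-extension (class)
  "Return the lowercased name of CLASS, e.g. \"post\" for the POST class."
  (string-downcase (symbol-name (class-name class))))

(defun files-for-class (paths class)
  "Keep only the PATHS whose file type matches the extension of CLASS."
  (remove-if-not (lambda (path)
                   (equal (pathname-type path) (class-extension class)))
                 paths))
```

For example, `(files-for-class '("a.post" "b.page" "c.post") (find-class 'post))` keeps only the two ".post" entries.
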
### Current Content Types & Indexes

There are 5 INDEX subclasses at present: TAG-INDEX, MONTH-INDEX,
NUMERIC-INDEX, FEED, and TAG-FEED. The first three support
grouping content by tags, publishing date, and reverse chronological
order, respectively. Feeds exist to special-case RSS and ATOM generation.
Currently, there is only 1 content type: POST, for blog entries.

I'm planning to add a content type PAGE, for static pages. It should
be a pretty straightforward subclass of CONTENT with the necessary
methods: `render`, `page-url` and `publish`. It could have a `url`
slot with `page-url` as a reader to allow arbitrary layout on the site.
The big question is how to handle templating and how indexes or other
content should link to it.

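The `url` slot idea might look roughly like this. It is only a sketch of the proposal: CONTENT is an empty stand-in here, the `title` slot is an assumption, and `render` and `publish` are left out because the templating question above is still open.

```lisp
;; Hypothetical sketch of the proposed PAGE type; CONTENT is a stand-in.
(defclass content () ())

(defclass page (content)
  ((url :initarg :url :reader page-url
        :documentation "Site-relative path chosen by the author.")
   (title :initarg :title :reader page-title)))
```

Declaring `page-url` as the slot reader means `(page-url (make-instance 'page :url "about/team" :title "Team"))` simply returns `"about/team"`, so the author decides where the page lives on the site.
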
### Templates and Theming

User configs are allowed to specify a theme, otherwise the default is
used. A theme consists of a directory under "themes/" containing css,
images, and at least 3 templates: Base, Index, and Post.

**Coleslaw** uses
[cl-closure-template](https://github.com/archimag/cl-closure-template)
exclusively for templating. **cl-closure-template** is a
well-documented CL implementation of Google's Closure Templates. Each
template file should contain a namespace like
`coleslaw.theme.theme-name`.

Each template creates a lisp function in the theme's package when
loaded. These functions take a property list (or plist) as an argument
and return rendered HTML. **Coleslaw** defines a helper called
`theme-fn` for easy access to the template functions. Additionally,
there are RSS, ATOM, and sitemap templates *coleslaw* uses automatically.
No need for individual themes to reimplement a standard, after all!

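The relationship between a theme package, its template functions, and a lookup helper can be sketched without loading any templating library. Everything below is a hypothetical stand-in: the `demo.theme.plain` package, the hand-written `POST` function that plays the role of a compiled template, and `demo-theme-fn`, which only imitates the idea behind `theme-fn`.

```lisp
;; Hypothetical stand-ins; no real template is compiled here.
(defpackage :demo.theme.plain
  (:use :cl))

;; Plays the role of a compiled template: takes a plist, returns HTML.
(setf (fdefinition (intern "POST" :demo.theme.plain))
      (lambda (plist)
        (format nil "<h1>~a</h1>" (getf plist :title))))

(defun demo-theme-fn (name &optional (package :demo.theme.plain))
  "Look up the template function called NAME in the theme PACKAGE."
  (fdefinition (find-symbol (string-upcase (string name)) package)))
```

A `render` method could then do something like `(funcall (demo-theme-fn 'post) (list :title "Hello"))`, which returns `"<h1>Hello</h1>"` with the stand-in above.
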
// TODO: Update for changes to compile-blog, indexes refactor, etc.
### The Lifecycle of a Page

- `(load-content)`

A page starts, obviously, with a file. When *coleslaw* loads your
content, it iterates over a list of content types (i.e. subclasses of
CONTENT). For each content type, it iterates over all files in the
repo with a matching extension, e.g. ".post" for POSTs. Objects of the
appropriate class are created from each matching file and inserted
into an in-memory data store. Then the INDEXes are created by
iterating over the POSTs and inserted into the data store.

- `(compile-blog dir)`

Compilation starts by ensuring the staging directory (`/tmp/coleslaw/`
by default) exists, cd'ing there, and copying over any necessary theme
assets. Then *coleslaw* iterates over the content types and index classes,
calling the `publish` method on each one. Publish iterates over the
class instances, rendering each one and writing the result out to disk
with `write-page` (which should probably just be renamed to `write-file`).
After this, an 'index.html' symlink is created to point to the first index.

- `(deploy dir)`

Finally, we move the staging directory to a timestamped path under the
config's `:deploy-dir`, delete the directory pointed to by the old
'.prev' symlink, point '.prev' at the build '.curr' currently
references, and point '.curr' at our freshly built site.

## Areas for Improvement

### Render Function Cleanup

There are currently 3 render-foo* functions and 3 implementations of the
render method. Only the render-foo* functions call `write-page` so there
should be some room for cleanup here. The render method implementations
are probably necessary unless we want to start storing their arguments
on the models. There may be a different way to abstract the data flow.

### User-Defined Routing

There is no reason *coleslaw* should be in charge of the site layout or
should care. If all objects only used the *slug* slot in their `page-url`
methods, there could be a :routing argument in the config containing
a plist of `(:class "~{format string~}")` pairs. A default method could
check the :class key under `(routing *config*)` if no specialized
`page-url` was defined. This would have the additional benefit of
localizing all the site routing in one place. New Content Types would
probably `pushnew` a plist onto the config key in their `enable` function.

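A rough sketch of the routing proposal follows. None of it exists in coleslaw: the `*routing*` variable stands in for `(routing *config*)`, the classes are empty stand-ins with only a slug, and the format strings are invented for the example.

```lisp
;; Hypothetical sketch of the proposal; nothing here exists in coleslaw.
(defvar *routing* (list :post "posts/~a" :shout "shouts/~a")
  "Plist mapping a class-name keyword to a format string for the slug.")

(defclass content ()
  ((slug :initarg :slug :reader content-slug)))

(defclass post (content) ())
(defclass shout (content) ())

(defgeneric page-url (document)
  (:documentation "Return the relative path of DOCUMENT on the site."))

(defun class-key (object)
  "Return the class name of OBJECT as a keyword, e.g. :POST."
  (intern (symbol-name (class-name (class-of object))) :keyword))

;; Default method: used whenever a class defines no specialized PAGE-URL.
(defmethod page-url ((object content))
  (let ((route (getf *routing* (class-key object))))
    (if route
        (format nil route (content-slug object))
        (error "No route configured for ~a" (class-key object)))))
```

Here `(page-url (make-instance 'post :slug "hello"))` returns `"posts/hello"`, and moving all shouts elsewhere only means editing one entry in the plist.
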
### Better Content Types

Creating a new content type is both straightforward and doable as a
plugin. All that is really required is a subclass of CONTENT with
any needed slots, a template, a `render` method to call the template
with any needed options, a `page-url` method for layout, and a
`publish` method.

Unfortunately, this does not solve:

1. The issue of compiling the template at load-time and making sure it
   was installed in the theme package. The plugin would need to do
   this itself or the template would need to be included in 'core'.
   Thankfully, this should be easy with *cl-closure-template*.
2. More seriously, there is no formal relationship between content
   types and indexes. Consequently, INDEXes include only POST
   objects at the moment. Whether the INDEX should specify what
   Content Types it includes or the CONTENT should specify which
   indexes it appears on is not yet clear.

### New Content Type: Shouts!

I've also toyed with the idea of a content type called a SHOUT, which
would be used primarily to reference or embed other content, sort of a
mix between a retweet and a del.icio.us bookmark. We encounter plenty
of great things on the web. Most of what I find winds up forgotten in
browser tabs or stored on Twitter's servers. It would be cool to see
SHOUTs as a plugin, probably with a dedicated SHOUT-INDEX, and some
sort of oEmbed/embed.ly/noembed support.

### Incremental Compilation

Incremental compilation is doable, even straightforward if you ignore
indexes. It is also preferable to building the site in parallel, as
avoiding work is better than using more workers. Moreover, being
able to determine (and expose) what files just changed enables new
functionality such as plugins that cross-post to Tumblr.

Git's post-receive hook is supposed to get a list of refs on standard
input. A brave soul could update our post-receive script to figure out
the original hash and pass that along to `coleslaw:main`. We could then
use it to run `git diff --name-status $HASH HEAD` to find changed
files and act accordingly.

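As a starting point for that brave soul, here is a small sketch of turning the text printed by `git diff --name-status` into data. It assumes the usual tab-separated layout (a status code, then the path, with renames listing the old and new path) and takes the output as a string; the function names are hypothetical and nothing here is wired into `coleslaw:main`.

```lisp
;; Hypothetical helpers for reading `git diff --name-status` output.
(defun split-on-tab (line)
  "Split LINE into a list of substrings at tab characters."
  (loop with start = 0
        for pos = (position #\Tab line :start start)
        collect (subseq line start pos)
        while pos
        do (setf start (1+ pos))))

(defun parse-name-status (output)
  "Parse OUTPUT into a list of (STATUS-CHAR . PATH) pairs.
For renames and copies the last field, the new path, is kept."
  (with-input-from-string (in output)
    (loop for line = (read-line in nil)
          while line
          unless (string= line "")
            collect (let ((fields (split-on-tab line)))
                      (cons (char (first fields) 0)
                            (car (last fields)))))))

(defun changed-paths (pairs)
  "Return the paths from PAIRS that still exist, i.e. everything not deleted."
  (mapcar #'cdr (remove #\D pairs :key #'car)))
```

The output string could come from running git in a subprocess, for instance with `uiop:run-program` and `:output :string`, though that wiring is left out here. Deleted files are separated out because their rendered pages would need removing rather than rebuilding.
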
This is a cool project and the effects are far-reaching. Among other
things, the existing deployment model would not work as it involves
rebuilding the entire site. In all likelihood we would want to update
the site 'in-place'. Atomicity of filesystem operations would be a
reasonable concern. Also, every numbered INDEX would have to be
regenerated along with any tag or month indexes matching the
modified files. If incremental compilation is a goal, simply
disabling the indexes may be appropriate for certain users.