Jeremy Dormitzer's blog

State of unifyDB: 2021

Wed, 10 Nov 2021 19:00:00 EST

2021 is almost over! And it's been over a year since I've written anything about my work-in-progress graph database unifyDB. BUT just because I'm bad at blogging doesn't mean I haven't made any progress. In fact, a bunch of exciting stuff happened for unifyDB in 2021, and I'm going to info-dump it all on you 🙃.

Aggregation, sorting and limiting

This one was really exciting, as it marked a huge step towards making unifyDB truly useful. I added the ability to aggregate, sort, and limit query results. Here's what the syntax looks like:

{:find [?role (min ?age)]
 :where [[_ :employee/role ?role]
         [_ :employee/age ?age]]
 :sort-by [(min ?age) :desc]
 :limit 5}

Aggregate expressions appear in parentheses (like a function call) in the find clause - the (min ?age) in this example. This query groups a database of employee data by job role, finds the minimum age within each role, and returns the five roles with the highest minimum ages.

One feature of this system I'm particularly proud of is implicit grouping. Take a look at that find clause again: [?role (min ?age)]. We are asking for both a non-aggregated variable, ?role, and an aggregated one, ?age. In a SQL query, we'd need a GROUP BY clause to specify how to construct groups of roles before we can find the minimum age of each group:

SELECT role, min(age)
FROM employees
GROUP BY role

The unifyDB query engine is smart enough to figure out that we need to group the result set by role before finding the minimum age of each group. If we add additional scalar or aggregate variables in the find clause, the query engine will automatically construct the appropriate groups such that all scalar variables have single values in each group.
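
To make implicit grouping concrete, here is a minimal Python sketch. It is not unifyDB's implementation: the bindings, the run_find function, and the tuple encoding of aggregate expressions are all assumptions made for illustration. The idea it shows is that the group key is simply every non-aggregated variable in the find clause.

```python
from collections import defaultdict

# Hypothetical variable bindings, as a :where clause might produce them.
bindings = [
    {"?role": "Engineer", "?age": 43},
    {"?role": "Engineer", "?age": 29},
    {"?role": "Manager", "?age": 37},
    {"?role": "Designer", "?age": 31},
    {"?role": "Designer", "?age": 52},
]

AGGREGATES = {"min": min, "max": max}


def run_find(find, bindings, sort_by=None, limit=None):
    """find mixes plain variables ("?role") with aggregate expressions
    encoded as (function name, variable) tuples (an assumed encoding)."""
    scalars = [term for term in find if isinstance(term, str)]
    # Implicit grouping: the group key is every non-aggregated variable.
    groups = defaultdict(list)
    for binding in bindings:
        groups[tuple(binding[var] for var in scalars)].append(binding)

    rows = []
    for key, members in groups.items():
        scalar_values = dict(zip(scalars, key))
        row = []
        for term in find:
            if isinstance(term, str):
                row.append(scalar_values[term])
            else:
                func, var = term
                row.append(AGGREGATES[func]([m[var] for m in members]))
        rows.append(row)

    if sort_by is not None:
        term, direction = sort_by
        rows.sort(key=lambda r: r[find.index(term)], reverse=(direction == "desc"))
    return rows[:limit] if limit is not None else rows


result = run_find(
    ["?role", ("min", "?age")],
    bindings,
    sort_by=(("min", "?age"), "desc"),
    limit=5,
)
print(result)
```

With this sample data there are only three roles, so the limit of 5 has no effect and every role comes back, ordered by minimum age from highest to lowest.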

Entity-level transactions and "pull" queries

unifyDB is technically a tuplestore - that is, its core unit of data is a "fact" tuple of the form [entity attribute value]. An entity is therefore represented as a set of fact tuples:

[[1 :name "Ben Bitdiddle"]
 [1 :age 43]
 [1 :role "Software Engineer"]]

This data format makes unifyDB highly flexible, able to answer not only questions about entities ("who has the software engineer role") but also questions about attributes ("what's the median age across all employees") and values. However, this flexibility comes at a cost: most programs (and programmers!) think in terms of entities. They talk about data shaped more like this:

{:id 1
 :name "Ben Bitdiddle"
 :age 43
 :role "Software Engineer"}

Requiring developers to transform entity-oriented data into fact-oriented data to fit unifyDB's internal data model imposes unnecessary cognitive load and violates one of unifyDB's core principles: meet the programmer where they are. In answer to this I added two features: entity transactions and "pull" queries.

Entity transactions allow data to enter the database in the shape of an entity map rather than as a set of facts. Here's how that looks:

(transact! db
 [{:unifydb/id "alyssa"
   :name "Alyssa P. Hacker"
   :age 37
   :role {:title "Engineering Manager"
          :salary 60000}}
  {:unifydb/id "ben"
   :name "Ben Bitdiddle"
   :age 43
   :supervisor "alyssa"
   :role {:title "Software Engineer"
          :salary 40000}}])

This transaction creates four new entities in the database: two employees and two roles (note the use of temporary IDs to map relationships between entities in the transaction). This transaction would expand to a set of facts that looks like this:

[;; Alyssa
 ["alyssa" :name "Alyssa P. Hacker"]
 ["alyssa" :age 37]
 ["alyssa" :role "role1"]
 ;; Alyssa's role
 ["role1" :title "Engineering Manager"]
 ["role1" :salary 60000]
 ;; Ben
 ["ben" :name "Ben Bitdiddle"]
 ["ben" :age 43]
 ["ben" :supervisor "alyssa"]
 ["ben" :role "role2"]
 ;; Ben's role
 ["role2" :title "Software Engineer"]
 ["role2" :salary 40000]]

This set of facts then gets transacted into the database normally, resolving the temporary ids ("alyssa", "ben", "role1", "role2") and making the facts available to be queried. As you can see, nested entities get flattened to be their own fact sets as well. This transformation maintains the flexibility of a fact-oriented data architecture while allowing developers to think in entity-oriented terms. And of course, for data that isn't inherently entity-oriented, raw fact tuples can still be transacted as usual.
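
The flattening step can be sketched in a few lines of Python. This is an illustration under assumptions, not unifyDB's code: expand_entities, the "unifydb/id" string key, and the generated "tmp1"-style ids are hypothetical stand-ins for whatever the transaction handler actually does.

```python
import itertools


def expand_entities(entities, id_key="unifydb/id"):
    """Flatten nested entity maps into [entity attribute value] facts."""
    counter = itertools.count(1)
    facts = []

    def expand(entity):
        # Use the supplied temporary id, or invent one for nested maps.
        entity_id = entity.get(id_key) or f"tmp{next(counter)}"
        for attr, value in entity.items():
            if attr == id_key:
                continue
            if isinstance(value, dict):
                # A nested map becomes its own entity; the parent
                # stores a reference to the child's id.
                value = expand(value)
            facts.append([entity_id, attr, value])
        return entity_id

    for entity in entities:
        expand(entity)
    return facts


facts = expand_entities([
    {"unifydb/id": "alyssa",
     "name": "Alyssa P. Hacker",
     "role": {"title": "Engineering Manager", "salary": 60000}},
])
for fact in facts:
    print(fact)
```

The facts come out in a different order than in the listing above (the nested role is emitted before the fact that points to it), which does not matter because a transaction is a set of facts.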

On the query side, the new "pull" feature adds a similarly entity-oriented way to make queries. This feature adds new syntax to the find clause that allows users to specify the shape of data they want to return. It's probably easiest to understand with an example:

{:find [(pull ?e [:name :age {:role [:title :salary]}])]
 :where [[?e :name "Alyssa P. Hacker"]]}

Given the data used in the entity transaction example above, this query would return:

[[{:name "Alyssa P. Hacker"
   :age 37
   :role {:title "Engineering Manager"
          :salary 60000}}]]

Let's pull apart the pull syntax (heh, see what I did there?). A pull query is a list of attributes to return, with nested entities represented as sub-maps within the list. So the query (pull ?e [:name :age {:role [:title :salary]}]) is asking for the :name and :age values for some entity ?e, as well as the :title and :salary attributes of the entity whose id is the value of ?e's :role attribute. This system effectively separates the logic of finding the entity you want (in this case, the entity whose :name is "Alyssa P. Hacker") from the logic of specifying which attributes of that entity you care about. This system also returns the data in the entity-oriented format that the rest of your program is already using.
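
As a rough illustration of how a pull pattern can be evaluated against a set of facts, here is a small Python sketch. The pull function below is hypothetical and assumes every attribute holds a single value; it is not how unifyDB implements the feature.

```python
def pull(facts, entity_id, pattern):
    """Build a nested dict for entity_id by following a pull pattern.

    pattern is a list whose items are attribute names or
    single-key dicts of the form {attribute: sub_pattern}.
    """
    result = {}
    for item in pattern:
        if isinstance(item, dict):
            for attr, sub_pattern in item.items():
                for e, a, v in facts:
                    if e == entity_id and a == attr:
                        # The value is the id of another entity: recurse.
                        result[attr] = pull(facts, v, sub_pattern)
        else:
            for e, a, v in facts:
                if e == entity_id and a == item:
                    result[item] = v
    return result


facts = [
    ["alyssa", "name", "Alyssa P. Hacker"],
    ["alyssa", "age", 37],
    ["alyssa", "role", "role1"],
    ["role1", "title", "Engineering Manager"],
    ["role1", "salary", 60000],
]
alyssa = pull(facts, "alyssa", ["name", "age", {"role": ["title", "salary"]}])
print(alyssa)
```

Finding the entity id ("alyssa" here) is left to the :where clause, which is the separation of concerns described above.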

Taken together, these two new features allow unifyDB to function as a document store in addition to its existing utility as a tuplestore. That's a huge boost for its use as a general-purpose application database.

unifyDB presentation at Boston Clojure Meetup

I gave a talk on unifyDB for the Boston Clojure Meetup! Due to my aforementioned lack of blogging, this is now the most in-depth look at the database available. I covered a brief history of the project, gave a demo of its capabilities at the time of recording, and answered some questions about the codebase. Do note that this was recorded before I added most of the features discussed in this post, so parts of the demo are slightly out of date. But on the whole it's still a worthwhile showcase of some of unifyDB's core features.

Here's the full recording:

What's next?

If you made it this far, thanks for sticking with me! I'm really excited about the improvements that came to unifyDB in 2021. But there's still quite a bit more to do before we can consider this thing released. In no particular order:

the other flagship feature, built-in access management. Alongside historical queries (which are already implemented), this is the key problem I'm trying to solve with unifyDB. The access management feature will allow fine-grained access control, letting engineers enforce rules like "only authorized admins can see personally-identifying customer data" without needing to write custom code
a built-in distributed key-value store. Right now unifyDB sits on top of existing key-value stores to provide the persistence layer. Before I consider this project finished it'll need to ship with a built-in distributed persistence layer
codebase improvements: there are a number of changes I want to make to the existing codebase. On the top of this list is fixing my usage of the Manifold library - it provides a nice async abstraction layer but in many places I'm turning an async call into a blocking call by unwrapping a Manifold deferred instead of mapping over it. I'd also like to add better error handling and more end-to-end tests

Hopefully I'll be a bit more public with my development efforts in 2022. I'll try to post more frequent (and shorter!) blog posts. I'll also be posting more actively into the #unifydb channel on the Clojurians Zulip chat, so check that out if you want to follow my progress.

Building a purely-functional static site generator

Sun, 27 Dec 2020 19:00:00 EST

Ok, I know. That was kind of a lie. No static site generator can ever really be purely functional, since the side effects are the whole point. But I think I found a way to build a site generator that retains all the benefits of a purely functional architecture - simplicity, flexibility, and hackability.

Let me back up. I have been looking into new technology for my website for a while now. Right now I'm using a very capable site generator called Pollen, but it has started to feel too complicated for my needs. I found Gatsby.js, and while it ticks most of the right boxes (able to source content from multiple sources at compile time, pluggable with a huge plugin ecosystem), it still has a ton of features I'm never going to use and feels over-architected for what should be a simple solution.

So I decided to build my own static site generator. I'm calling it Obelix, and it aims to combine the best parts of Gatsby with a stripped-down, simple architecture. This blog post was rendered in it! In this post, I'm going to give a brief overview of how Obelix works and talk about why I built it this way.

The big picture

Obelix uses a simple internal data structure to represent the contents of a static site:

{:metadata {}
 :routes []}

:metadata holds a dictionary of arbitrary metadata about the site as a whole, stuff like the copyright date or the last updated timestamp. :routes is a list of all the site's static pages. If the site consists of three routes — index.html, blog/post-1.html, blog/post-2.html — then the :routes list might look like this:

[{:name "index.html"
  :type :page
  :content "Content here"}
 {:name "blog/post-1.html"
  :type :page
  :content "More content here"}
 {:name "blog/post-2.html"
  :type :page
  :content "So much content!"}]

As you can see, each element of the :routes list is a map representing the asset that lives at that URL. These asset maps can have whatever keys are necessary to render that asset.

The heart of Obelix is a pipeline of handler functions. A handler function takes in a site map and does something with it — add a key, transform a node, write stuff out to disk. Handler functions are added via plugins, which are simply modules that provide handler functions to be run at various points during the build pipeline. Obelix comes with several core plugins that always run during the build process, and more can be added via third-party or project-specific plugins.

The plugins are where all of the actual behavior of the site generator lives. For example, one plugin reads Markdown-formatted files from disk, parses them, and adds them as routes in the site map. Another plugin walks the routes, transforms the pages to text, and writes them to disk in the output directory.

The beauty of this functional approach is that it is capable of supporting basically any feature offered by other static site generators, but those features can be implemented by plugins outside the core of the generator itself. A templating engine, for example, where template files in the source directory get applied to multiple pages in the output site, can be implemented as a plugin that wraps some of the routes in the site map with new content.
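
The handler pipeline is easy to sketch. The Python below is only an illustration of the architecture described above; Obelix itself is distributed on NPM, and the handler names and the run_pipeline function here are hypothetical, not part of its API.

```python
from functools import reduce


def add_markdown_routes(site):
    """Stand-in for a plugin that would read source files from disk."""
    routes = site["routes"] + [
        {"name": "index.html", "type": "page", "content": "Content here"}
    ]
    return {**site, "routes": routes}


def apply_template(site):
    """Stand-in for a templating plugin: wrap every page's content."""
    routes = [
        {**route, "content": f"<main>{route['content']}</main>"}
        for route in site["routes"]
    ]
    return {**site, "routes": routes}


def run_pipeline(handlers, site):
    # Each handler receives the site map and returns a new one.
    return reduce(lambda acc, handler: handler(acc), handlers, site)


site = run_pipeline(
    [add_markdown_routes, apply_template],
    {"metadata": {}, "routes": []},
)
print(site["routes"][0]["content"])
```

Because each handler returns a new site map instead of mutating its input, handlers can be reordered, removed, or tested in isolation. A handler that writes files to disk would be the one impure step, typically placed last.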

I'm really happy with how Obelix turned out. It's available for installation on NPM and the full source code is available on GitHub. If you’re interested in contributing plugins or want to use Obelix for your own site, let me know on Twitter!

unifyDB Dev Diary 1: the query system

Fri, 2 Oct 2020 20:00:00 EDT

This is the first development diary for the database I'm writing, unifyDB. I wrote a brief introduction to the project here. In this post I'm going to talk about unifyDB's query system: what it does and how it works.

I want to start with an example of a unifyDB query, but to understand that we need to understand a bit about how unifyDB represents data. All data in unifyDB is stored as a collection of facts. A fact is a tuple with three pieces of information: an entity ID, an attribute name, and a value (actually, a fact has two additional fields, a transaction ID and an added? flag, but we won't worry about those until we talk about time-traveling queries, which deserves its own blog post). For example, we might represent some user records with the following set of facts:

(1, "username", "harry")
(1, "role", "user")
(1, "preferred-theme", "light")
(2, "username", "dumbledore")
(2, "role", "user")
(2, "role", "admin")
(2, "preferred-theme", "light")
(3, "username", "you-know-who")
(3, "role", "user")
(3, "role", "admin")
(3, "preferred-theme", "dark")

This corresponds with the following records in a more conventional JSON format:

[
    {
        "id": 1,
        "username": "harry",
        "role": ["user"],
        "preferred-theme": "light"
    },
    {
        "id": 2,
        "username": "dumbledore",
        "role": ["user", "admin"],
        "preferred-theme": "light"
    },
    {
        "id": 3,
        "username": "you-know-who",
        "role": ["user", "admin"],
        "preferred-theme": "dark"
    }
]

(The astute reader will notice that there’s not actually a way to specify using a set of facts that "role" is a list but "preferred-theme" is a scalar value, i.e. the cardinality of an attribute. This requires another database feature, attribute schemas, that I’m going to save for another blog post.)
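
To see why cardinality has to come from somewhere other than the facts themselves, consider this Python sketch that folds facts back into records. The facts_to_records function and its many parameter are hypothetical; they stand in for the attribute schema mentioned above.

```python
def facts_to_records(facts, many=frozenset()):
    """Group (entity, attribute, value) facts into one dict per entity.

    Attributes named in `many` collect their values into a list;
    all others keep a single value.
    """
    records = {}
    for entity, attr, value in facts:
        record = records.setdefault(entity, {"id": entity})
        if attr in many:
            values = record.setdefault(attr, [])
            if value not in values:
                values.append(value)
        else:
            record[attr] = value
    return list(records.values())


facts = [
    (1, "username", "harry"),
    (1, "role", "user"),
    (2, "username", "dumbledore"),
    (2, "role", "user"),
    (2, "role", "admin"),
]
records = facts_to_records(facts, many={"role"})
print(records)
```

Without the many argument, the second "role" fact for entity 2 would silently overwrite the first, which is exactly the ambiguity a schema resolves.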

With that under our belt, let's take a look at an example unifyDB query. The unifyDB server understands queries written in extensible data notation (EDN), but database clients for different programming languages will allow developers to write queries that feel native to that language. Here's a query in EDN format:

{:find [?username]
 :where [[?e :preferred-theme "light"]
         [?e :username ?username]]}

This query says, “find me the values of all the username attributes of entities whose preferred-theme is "light"”. If we run this query on the set of facts given above, it would return:

[["harry"]
 ["dumbledore"]]

Note that the return value is a list of lists — although our query only asked for one field, username, it could have asked for more, in which case each result in the result list would be a list with all the requested values. Once again, although unifyDB itself returns data in EDN format, client libraries will wrap that return value in whatever native data structure is convenient.

Let’s break that query down a bit. First, a bit of notation: any symbol that starts with a ? is called a variable, and is similar in spirit to a variable in a programming language. The query above has two major pieces: a :find clause and a :where clause. The :find clause is straightforward: it asks to find the value of the variable ?username. But how does it know what value that variable has? That’s where things get interesting.

Let's take a closer look at the :where clause:

:where [[?e :preferred-theme "light"]
        [?e :username ?username]]

It is a list of two relations - that is, expressions which assert some relationship between variables. The first relation, [?e :preferred-theme "light"], declares that there is some entity ?e whose :preferred-theme attribute has value "light". The second relation is slightly more abstract, declaring a relation between some entity ?e and the value of its :username attribute, which it assigns to the variable ?username.

Notice that both relations share a variable, ?e. This is where the magic happens! When two relations share a variable, they are said to unify. This means that the query engine finds all facts that satisfy both relations for some entity ?e. In other words, unifyDB will find all sets of facts such that the facts share an entity ?e, have one fact with attribute :preferred-theme and value "light", and have another fact with attribute :username and any value.

The result of this unification process is a set of variable bindings, calculated from the facts that satisfy the query relation. In our example, we find that the following set of facts satisfies the query relation:

(1, "username", "harry")
(1, "preferred-theme", "light")
(2, "username", "dumbledore")
(2, "preferred-theme", "light")

Unifying these facts with the variables in the :where clause yields the following set of bindings:

{
    ?e: 1,
    ?username: "harry"
},
{
    ?e: 2,
    ?username: "dumbledore"
}

Finally, since our :find clause asks only for the variable ?username, we look up that variable in the binding set, returning one result for each binding in the set:

[["harry"]
 ["dumbledore"]]

This unification approach to querying makes the database particularly powerful. Although in this example we unified on the entity ID, we can also unify on the attribute name, value, or some combination of all three. This gives unifyDB the ability to function as a document store (looking up the “documents”, i.e. entities, which have attributes and values matching some pattern); or as a column-oriented database, looking for all the values of a certain attribute or even all the attributes that have a certain value. Of course, most apps will use a combination of all these different querying approaches, letting the database work for them in whatever way they need for a particular feature.
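
The matching process walked through above can be sketched compactly. The Python below is a naive illustration, not unifyDB's query engine: it scans every fact for every pattern, and the match and query functions are hypothetical names. It does show how a shared variable such as ?e constrains later patterns.

```python
def is_variable(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, binding):
    """Try to extend binding so that pattern equals fact; None on failure."""
    extended = dict(binding)
    for term, value in zip(pattern, fact):
        if is_variable(term):
            if term in extended and extended[term] != value:
                return None
            extended[term] = value
        elif term != value:
            return None
    return extended


def query(find, where, facts):
    bindings = [{}]
    for pattern in where:
        # Keep only the bindings that some fact can extend consistently.
        bindings = [
            extended
            for binding in bindings
            for fact in facts
            if (extended := match(pattern, fact, binding)) is not None
        ]
    return [[binding[var] for var in find] for binding in bindings]


facts = [
    (1, "username", "harry"),
    (1, "preferred-theme", "light"),
    (2, "username", "dumbledore"),
    (2, "preferred-theme", "light"),
    (3, "username", "you-know-who"),
    (3, "preferred-theme", "dark"),
]
result = query(
    ["?username"],
    [("?e", "preferred-theme", "light"), ("?e", "username", "?username")],
    facts,
)
print(result)
```

Since variables can appear in any position of a pattern, the same match function also handles unifying on attribute names or values, for example ("?e", "?attr", "light") to find every attribute whose value is "light".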

In fact, this is only half of the query engine, since it also supports adding rules that let you compute new facts from existing facts in the database, but that is complex enough to warrant its own post.

There is a lot more I could write about here, but this is running kind of long so I’m going to leave it at this for now. You can follow the development of unifyDB on GitHub (the query engine is implemented here and unification is implemented here). If you are interested in this topic and want to dive into the implementation, I based my work on the excellent logical database engine in chapter 4.4 of Structure and Interpretation of Computer Programs.

As always, if you want to know more about unifyDB, have questions about this post or just want to geek out, hit me up on Twitter.

unifyDB Dev Diary 0: I’m building a database!

Sat, 8 Aug 2020 20:00:00 EDT

Phew, it’s been a while! Over a year, in fact. And what a wild year! Lots of good things happened: I got married, got a new job that I love, moved to a nice new apartment. Also some not-so-nice things, but since you are all living through 2020 just like me I don’t think I need to go into those. But I have still found some side-project time, and I’d like to start talking about what I’m building.

So – I’m excited to announce that I’m building a database! I’m calling it unifyDB. It’s going to be a general-purpose database with some pretty interesting properties:

It maintains a complete history of changes to all entities stored in the database
You can make queries for historical data, e.g. “what did this user record look like last Tuesday?”
Arbitrary metadata can be attached to transactions – for example, you can add an application user ID to every transaction your app makes
Fine-grained access control is built into the database, allowing developers to limit access to particular attributes across all entities

This is the database that I’ve always wanted – basically, I’m tired of being in meetings where the boss says “who changed this user’s email address?” and everyone just looks at each other and shrugs.

I’m designing unifyDB to be as modular as possible – I want it to be as easy to run it as a single node on your local machine as it is to run in an autoscaling cluster on your cloud of choice.

I’ve actually been working on this on and off for over a year. The code lives in a GitHub repository. Fair warning: it’s mostly undocumented and nowhere close to being finished. So far, I’ve written the query engine, the transaction handler, the web server (yes, it has an HTTP interface), and a bunch of underlying infrastructure. So as it currently stands, unifyDB is able to store data (in-memory since I haven’t built the storage layer yet) and issue history-aware queries. I’m in the middle of writing the authentication mechanism. After that, it’s on to the storage layer, then most likely the access control layer.

I’m going to start publishing monthly development diaries detailing the more interesting aspects of the database. I’ll start with a post about the query system implementation sometime in the next couple of weeks. Sound interesting? Follow along on Feedly or your RSS reader of choice!

In the meantime, if you want to know more about unifyDB or just want to geek out, hit me up on Twitter.

More than JSON: ActivityPub and JSON-LD

Mon, 22 Apr 2019 20:00:00 EDT

In which our hero discovers the power of normalization and JSON-LD

The problem with JSON

I’ve been doing a lot of research for my current side project, Pterotype. It’s a new kind of social network built as a WordPress plugin that respects your freedom, encourages choice, and interoperates with existing social networks through the power of ActivityPub. It’s undergone several iterations already – the beta has been out for a while now, and I’ve been working hard on a version 2 for the last several months.

One of the things I wasn’t satisfied with in the first version of Pterotype was the way it stores incoming data. ActivityPub messages are serialized in a dialect of JSON called JSON-LD. I didn’t really get JSON-LD when I started this project. It seemed overcomplicated and confusing, and I was more interested in shipping something that worked than understanding the theoretical underpinnings of the federated web. So I just kept the incoming data in JSON format. This worked, sort of, but I kept running into annoying, hard-to-reason-about situations. For example, consider this ActivityPub object, representing a new note that Sally published:

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://example.org/activities/1",
  "type": "Create",
  "actor": {
    "type": "Person",
    "id": "https://example.org/sally",
    "name": "Sally"
  },
  "object": {
    "id": "https://example.org/notes/1",
    "type": "Note",
    "content": "This is a simple note"
  },
  "published": "2015-01-25T12:34:56Z"
}

The problem is that the above object, according to the ActivityPub specification, is semantically equivalent to this one:

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://example.org/activities/1",
  "type": "Create",
  "actor": "https://example.org/sally",
  "object": "https://example.org/notes/1",
  "published": "2015-01-25T12:34:56Z"
}

This is the object graph in action – the actor and object properties are pointers to other objects, and as such they can either be JSON objects embedded within the Create activity, or URIs that dereference to the actual object (dereferencing is a fancy word for following the URI and replacing it with whatever JSON object is on the other side). Since I was representing these ActivityPub objects in this JSON format, that meant that whenever I saw an actor or object property, I always had to check whether it was an object or a URI and if it was a URI I had to dereference it to the proper object. This led to tons of annoying boilerplate and conditionals:

if ( is_string( $activity['object'] ) ) {
    $activity['object'] = dereference_object( $activity['object'] );
}

Yikes. So I came up with what I thought was a clever solution: just walk the object graph and dereference every URI I found whenever I saw a new JSON object. So I would receive Sally’s Create activity and traverse the JSON representation of its graph, dereferencing the actor and object objects in the process. This effectively turned the second representation above into the first one. Problem solved, right?

Well, not quite. There are actually a bunch of problems with that approach. First, not all URIs in the JSON object should be dereferenced. For example, there is an ActivityPub attribute called url that is – you guessed it – a URL! And it is supposed to stay a URL, not get dereferenced to some other thing. Okay, so I’ll only dereference URIs that belong to attributes I know should contain references to other objects – actor, object, etc. But there’s still a problem! There’s no guarantee that we’ll be able to successfully dereference a URI. Maybe the server that was hosting that object went down. Maybe there’s a temporary network failure. Maybe it’s the year 3000 and bitrot has taken down 80% of the internet. The point is, even if we preemptively dereference all the URIs we can, we still need to handle the case where we couldn’t access the actual object and are stuck with the URI. Which means we still need those stupid conditionals everywhere!

JSON-LD to the rescue

So what’s the actual solution for this? Well, as it turns out these were exactly the types of issues that JSON-LD is designed to solve. JSON-LD provides a way to normalize data into a standard form based on a context that defines a schema for the data. Here’s the second version of Sally’s activity from above after undergoing JSON-LD expansion:

[
  {
    "https://www.w3.org/ns/activitystreams#actor": [
      {
        "@id": "https://example.org/sally"
      }
    ],
    "@id": "https://example.org/activities/1",
    "https://www.w3.org/ns/activitystreams#object": [
      {
        "@id": "https://example.org/notes/1"
      }
    ],
    "https://www.w3.org/ns/activitystreams#published": [
      {
        "@type": "http://www.w3.org/2001/XMLSchema#dateTime",
        "@value": "2015-01-25T12:34:56Z"
      }
    ],
    "@type": [
      "https://www.w3.org/ns/activitystreams#Create"
    ]
  }
]

So what’s up with those weird URL-looking attributes? And why has everything become an array?

The expansion algorithm has normalized the data into a form that is meant to be universally understood. The attributes – object, actor, etc. – have become URIs with a universal meaning and a known schema. In other words, any application that speaks JSON-LD knows what an https://www.w3.org/ns/activitystreams#actor is, even if it doesn’t know what an actor is.

Importantly for our purposes, take a look at what the object field has turned into. We went from:

"object": "https://example.org/notes/1"

To:

"https://www.w3.org/ns/activitystreams#object": [
  {
    "@id": "https://example.org/notes/1"
  }
]

Because the object attribute is specified in the ActivityStreams JSON-LD vocabulary to be of @type: @id, the expansion process was able to infer that object ought to be, well, an object. This neatly solves the problem of “is this string attribute actually a reference” – all references are clearly marked by their @id attributes now. Plus, this allows us to be smarter about when we dereference an object – for example, we can defer dereferencing until we actually need to access the attributes of the linked object. This approach also addresses the problem of network errors when dereferencing – if we can’t dereference, we just end up with an object that has only an @id, which can still be handled gracefully by the application.
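
Here is a minimal Python sketch of the deferred-dereferencing idea. It does not implement JSON-LD expansion and uses no JSON-LD library; as_reference, dereference, and fake_fetch are hypothetical helpers that only illustrate how an object carrying just an @id can stand in for one that could not be fetched.

```python
def as_reference(value):
    """Normalize a bare URI string or an embedded object to a dict with "@id"."""
    if isinstance(value, str):
        return {"@id": value}
    return value


def dereference(node, fetch):
    """Return the full object if it can be fetched, else the bare reference.

    fetch is a caller-supplied function (hypothetical here) that maps a
    URI to a dict and raises OSError on network failure.
    """
    if len(node) > 1:
        return node  # already has attributes beyond "@id"
    try:
        return {**fetch(node["@id"]), "@id": node["@id"]}
    except OSError:
        return node  # still a valid object, just without attributes


def fake_fetch(uri):
    # Stand-in for an HTTP request; only one URI is reachable.
    if uri == "https://example.org/notes/1":
        return {"content": "This is a simple note"}
    raise OSError("unreachable")


note = dereference(as_reference("https://example.org/notes/1"), fake_fetch)
missing = dereference(as_reference("https://example.org/notes/2"), fake_fetch)
print(note)
print(missing)
```

Callers always receive a dict with an @id, so the string-or-object conditional disappears from application code; the only remaining question is whether the attributes they want are present.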

Hopefully this gave some insight into the types of challenges involved with building ActivityPub-powered applications and the point of JSON-LD. Have questions? Did I do something wrong? Let me know in the comments or on the Fediverse!

ActivityPub: Good enough for jazz

Sun, 6 Jan 2019 19:00:00 EST

Kaniini, one of the lead developers of Pleroma, recently published a blog post called ActivityPub: The “Worse is Better” Approach to Federated Social Networking. It’s a critique of the security and safety of the ActivityPub protocol. They make some good points:

ActivityPub doesn’t support fine-grained access control checks, e.g. I want someone to be able to see my posts but not respond to them
Instances you’ve banned can still see threads from your instance in some ActivityPub implementations, because someone from a third instance replies to the thread and that reply reaches the banned instance

The post also generated an interesting Fediverse thread discussing the tradeoffs between proliferating the existing protocol versus making changes to it, and whether it would be possible to improve the protocol without breaking backward compatibility. It’s worth a read.

Here’s the thing: ActivityPub is a protocol, and protocols are only valuable as long as there is software out there actually using the protocol. At the end of the day, that’s the most important measure of success. Don’t get me wrong – protocols need to do the job they set out to do well. But at some point, the protocol works well enough that it becomes more important to foster adoption than to continue improving. I believe that ActivityPub has reached that point.

Now, I’m not suggesting that we stop development on the protocol. But future improvements to it should be iterative, building on the existing specification, and backward compatible whenever possible. For example, by all means let’s come up with a better access control model for ActivityPub – but we should also come up with a compatibility layer that assumes some default set of access capabilities for implementations that haven’t upgraded. This lets us move forward without leaving the protocol’s participants behind, preserving ActivityPub’s value.

We are in good company here. This model is exactly how HTTP became the protocol that powers the internet. If you have the time, check out this excellent (brief) history of the HTTP protocol. Here are the highlights: Tim Berners-Lee came up with HTTP 0.9, which was an extremely simple protocol that allowed clients to request a resource and receive a response. HTTP 1.0 added headers and a variety of other features. HTTP 1.1 added performance optimizations and fixed ambiguities in the 1.0 specification.

Critically, all of these versions of HTTP were similar enough that a server that supported HTTP 1.1 could trivially also support HTTP 1.0 and 0.9 (because 0.9 is actually a subset of 1.1). In fact, the Apache and Nginx web servers, which power most websites on the internet, still support HTTP 0.9! By designing and iterating on HTTP in a way that preserved backward compatibility, the early web pioneers were able to build a robust, performant, secure protocol while still encouraging global adoption.

If we want the Fediverse to be just as robust, performant, secure, and globally adopted, we should take the same approach.

Announcing Pterotype

Wed, 14 Nov 2018 19:00:00 EST

In my last post, I wrote about an emerging web standard called ActivityPub that lets web services interoperate and form a federated, open social network. I made an argument about how important this new standard is – how it tears down walled gardens, discourages monopolies and centralization, and encourages user freedom.

I genuinely believe what I wrote, too. And so, to put my money where my mouth is, I’m excited to announce Pterotype! It’s a WordPress plugin that gives your blog an ActivityPub feed so that it can take advantage of all the benefits ActivityPub has to offer.

Why WordPress?

My mission is to open up the entire internet. I want every website, every social network, and every blog to be a part of the Fediverse. And WordPress runs literally 30% of the internet. It’s not my favorite piece of software, and I certainly never expected to write any PHP, but the fact is that writing a WordPress plugin is the highest-impact way to grow the Fediverse the fastest.

So wait, what does this actually do?

Great question, glad you asked. Pterotype makes your blog look like a Mastodon/Pleroma/whatever account to users on those platforms. So, if you install Pterotype on your blog, Mastodon users will be able to search for blog@yourawesomesite.com in Mastodon and see your blog as if it was a Mastodon user. If they follow your blog within Mastodon (or Pleroma, or…), your new posts will show up in their home feed. This is what I meant in my last post about ActivityPub making sites first-class citizens in social networks – you don’t need a Mastodon account to make this work, and your content will show up in any service that implements ActivityPub without you needing an account on those platforms either.

Here’s what this blog looks like from Mastodon:

The plugin also syncs up comments between WordPress and the Fediverse. Replies from Mastodon et al. on your posts will show up as WordPress comments, and comments from WordPress will show up as replies in the Fediverse. This is what I meant about tearing down walled gardens: people can comment on your blog posts using the platform of their choice, instead of being limited by the platform hosting the content.

Sounds amazing! Can I use it now?

Yes, with caveats. Pterotype is in early beta. The core features are in there – your blog will get a Fediverse profile, posts will federate, and comments will sync up – but it’s a pretty fiddly (and sometimes buggy) experience at the moment. If you do want to try it out, the plugin is in the plugin repository. If you install it on your blog, please consider signing up for the beta program as well – it’s how I’m collecting feedback and bug reports so I can make the plugin the best that it can be.

If you’d rather just follow my progress and dive in when it’s finished, that’s fine too! I made my development roadmap publicly available, and the plugin itself is open-source on GitHub. I’m currently doing a major refactor, pulling out all of the ActivityPub-related logic into its own library – once that’s done, it’ll be back to business as usual adding features and stability to Pterotype.

If you’ve read this far, and this project resonates with you, then you might be interested in becoming a sponsor on Patreon. Pterotype is free and open-source, so this is its only source of funding. For moment-to-moment updates, you can follow me on Mastodon.

See you on the Fediverse!

What is ActivityPub, and how will it change the internet?

Fri, 14 Sep 2018 20:00:00 EDT

A new kind of social network

There’s a new social network in town. It’s called Mastodon. You might have even heard of it. On the surface, Mastodon feels a lot like Twitter: you post “toots” up to 500 characters; you follow other users who say interesting things; you can favorite a toot or re-post it to your own followers. But Mastodon is different from Twitter in some fundamental ways. It offers many more ways for users to control the posts they see. It fosters awareness of the effect your posts have on others through a content warning system and encourages accessibility with captioned images. At its core, though, there’s a more fundamental difference from existing social networks: Mastodon isn’t controlled by a single corporation. Anyone can operate a Mastodon server, and users on any server can interact with users on any other Mastodon server.

This decentralized model is called federation. Email is a good analogy here: I can have a Gmail account and you can have an Outlook account, but we can still send mail to each other. In the same way, I can have an account on mastodon.technology, and you can have an account on mastodon.social, but we can still follow each other, like and re-post each other’s toots, and @mention each other. Just like Gmail servers know how to talk to Outlook servers, Mastodon servers know how to talk to other Mastodon servers (if you hear people talking about a Mastodon “instance”, they mean server). And just like Gmail and Outlook are controlled by different corporations, Mastodon servers are owned and operated by many different people and organizations. If you wanted, you could host your own Mastodon instance!

Why does this matter? It means that Mastodon users have a choice about where they hang out online. If Twitter decides that your posts shouldn’t be on their platform, they can shut down your account and there’s nothing you can do about it (or conversely, if they decide your f-ed up content is totally fine, there’s nothing anyone else can do about it). On the other hand, if you disagree with the administrators of your Mastodon instance, you have the choice to move your account to another instance (switching providers, as it were) or to host your own instance if you’re willing to dedicate the time and effort.

The federated model also tends to align incentives better than centralized alternatives. Mastodon instances are usually run and moderated by members of the community that uses that particular Mastodon server – for example, I’m part of a community of tech folks over at mastodon.technology, and our server is administrated and moderated by a member of the community. He has a vested interest in making mastodon.technology a nice place to hang out since he hangs out there too. Contrast that with Twitter: Twitter is owned and operated by a massive, venture-backed, for-profit corporation. Now, I’m certainly not against companies making money (more on that later), but Twitter only cares about making Twitter a nice place to hang out to the extent that they profit by it, which has led them to make some user-unfriendly choices.

So Mastodon is pretty cool. But that’s not what really gets me excited. I’m excited about how Mastodon servers allow users on different instances to interact. It’s a protocol called ActivityPub, and it’s going to change the way the internet works.

ActivityPub

ActivityPub is a social networking protocol. Think of it as a language that describes social networks: the nouns are users and posts, and the verbs are like, follow, share, create… ActivityPub gives applications a shared vocabulary that they can use to communicate with each other. If a server implements ActivityPub, it can publish posts that any other server that implements ActivityPub knows how to share, like and reply to. It can also share, like, or reply to posts from other servers that speak ActivityPub on behalf of its users.

This is how Mastodon instances let users interact with users on other instances: because every Mastodon instance implements ActivityPub, one instance knows how to interpret a post published from another instance, how to like a post from another instance, how to follow a user from another instance, etc.
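To make the "shared vocabulary" idea concrete, here is a rough sketch of what two activities might look like, written as Clojure maps that mirror the JSON an ActivityPub server sends over the wire. The type names (Create, Like, Follow, Note) and the @context URL come from the ActivityStreams vocabulary that ActivityPub builds on; the example.com and example.org URLs and the describe function are made up for illustration and are not part of any real server.

```clojure
;; A hypothetical "Create" activity wrapping a Note. Real servers exchange
;; this as JSON; a Clojure map with string keys has the same shape.
(def create-note
  {"@context" "https://www.w3.org/ns/activitystreams"
   "type"     "Create"
   "actor"    "https://blog.example.com/users/alice"
   "object"   {"type"         "Note"
               "attributedTo" "https://blog.example.com/users/alice"
               "content"      "Hello, Fediverse!"}})

;; A hypothetical "Like" activity. It refers to the liked post by its URL.
(def like-note
  {"@context" "https://www.w3.org/ns/activitystreams"
   "type"     "Like"
   "actor"    "https://social.example.org/users/bob"
   "object"   "https://blog.example.com/posts/1"})

(defn describe
  "Toy interpreter: turns an activity into a human-readable sentence.
  Unknown verbs fall through to the default branch."
  [activity]
  (let [actor (get activity "actor")]
    (case (get activity "type")
      "Create" (str actor " posted: " (get-in activity ["object" "content"]))
      "Like"   (str actor " liked " (get activity "object"))
      "Follow" (str actor " followed " (get activity "object"))
      (str actor " did something this server does not understand"))))
```

Because both servers agree on what "type", "actor", and "object" mean, the receiving side only needs a dispatch like describe to decide what to do with an incoming activity, no matter which application produced it.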

ActivityPub is much bigger than just Mastodon, though. It’s a language that any application can implement. For example, there’s a YouTube clone called PeerTube that also implements ActivityPub. Because it speaks the same language as Mastodon, a Mastodon user can follow a PeerTube user. If the PeerTube user posts a new video, it will show up in the Mastodon user’s feed. The Mastodon user can comment on the PeerTube video directly from Mastodon. Think about that for a second. Any app that implements ActivityPub becomes part of a massive social network, one that conserves user choice and tears down walled gardens. Imagine if you could log into Facebook and see posts from your friends on Instagram and Twitter, without needing an Instagram or Twitter account.

So here’s how ActivityPub is going to change the internet:

No more walled gardens

ActivityPub separates content from platform. Posts from one platform propagate to other platforms, and users don’t need to make separate accounts on every platform that they want to use. This has an additional benefit: since your ActivityPub identity (your Mastodon account, your PeerTube account, etc.) is valid across all ActivityPub-compliant applications, it serves as a much stronger identity signal, preventing malicious actors from impersonating you (e.g. creating a Twitter account in your name). If you can share one account across multiple platforms, no one can pretend to be you on those platforms – you are already there!

Social networking comes built-in

With traditional internet media, you need to rely on external services to share your work on social networks. If you want people to share your YouTube video around, you need to post it to Facebook or Twitter. But ActivityPub-enabled applications are social by nature. A PeerTube video can be shared or liked by default by users on Mastodon. A Plume blogger can build an audience on Mastodon or PeerTube without any additional effort since Mastodon and PeerTube users can follow Plume blogs natively. Users on all these platforms see content from the other apps on the platform of their choice. And Mastodon, PeerTube, and Plume are just the beginning – as more platforms begin implementing ActivityPub, the federated network grows exponentially.

Network effects that help users instead of harming them

“Network effects” leaves kind of a dirty taste in my mouth. It’s usually used as a euphemism for “vendor lock-in”; the reason that Facebook became such a giant was that everyone needed to be on Facebook to participate in Facebook’s network. However, ActivityPub flips this equation on its head. As more platforms become ActivityPub compliant, it becomes more valuable for platforms to implement ActivityPub: more apps means more users on the federated network, more posts to read and share, and more choice for users. This network effect discourages vendor lock-in. In the end, the users win.

It’s going to be an uphill battle

I hope I’ve convinced you of the radical impact that ActivityPub could have on the internet. But there are some significant barriers preventing widespread adoption. The thorniest one is money.

Why is money an issue? Aren’t Mastodon and PeerTube free and open-source? Well, first of all, open source projects need funding too (that’s a big topic that deserves its own blog post, so I’m leaving it alone for now). The bigger issue right now is user adoption. The ActivityPub network is only viable if people use it, and to compete in any significant way with Facebook and Twitter we need a lot of people to use it. To compete with the big guys, we need big money. We need to be able to spread the word through marketing and blogging, to engineer new ActivityPub applications, and to support people working full-time on bringing about this revolution.

I know this isn’t necessarily a popular view in the open-source world, but I see funding as a critical priority to bring about the vision that ActivityPub promises. Unfortunately, it’s not clear how to obtain it.

All the major ActivityPub-compliant applications I’ve written about are open source projects, built and run by volunteers with tiny budgets. Traditional social networking companies like Twitter and Facebook are funded by selling advertisements on their platform. But in order to make any significant revenue from ads, you need a centralized audience whose attention you control. Facebook needs to be able to say, “we have X billion users; give us your money and we will show them your ads”. Plus, the big social companies extract value from their users by segmenting them based on their behavior and interests, enabling micro-targeted ad campaigns.

None of that is possible when the users and content are spread across many servers and platforms. There is no centralized audience to segment and advertise to. We’ll need to rethink the fundamental business model of social networking if we want ActivityPub to take off.

That being said, I do think ActivityPub offers tremendous business value. It turns your corporate blog into a social network by offering native sharing, following, liking, and replying. And it does so on your customer’s terms, which not only prevents abusive, spammy content but also helps your company’s reputation with users and potential customers. These benefits are valuable, and I think there is a way to turn that value into funding.

It’s important to think about how to make this revolution happen. ActivityPub has the potential to change the way we think and act on the internet, in a way that encourages decentralization and puts users first again. That’s a vision worth fighting for.

A DSL for music

Sat, 4 Aug 2018 20:00:00 EDT

Haskell School of Music

I recently discovered Haskell School of Music. It’s a book about algorithmic music, which is awesome because: a) I’ve been obsessed with procedural generation for years and b) I like music as much as I like programming. So you can imagine my excitement when I discovered that someone had written a textbook combining my favorite areas of study.

Haskell School of Music is aimed at intermediate-level CS students, so it covers a lot of the basics of functional programming. It aims to be an introduction to the Haskell programming language while also thoroughly examining computer music. It starts simply by defining the data structures that represent music, and progresses to functional programming concepts, procedurally generating music, and doing signal processing and MIDI interfacing to actually play the songs.

I like Haskell, but I want to write music in Clojure. Why? First of all, because it’s the One True Language (it’s fine if you disagree with me – your opinion is valid even if it’s objectively incorrect). But more importantly, Clojure excels as an environment for writing domain-specific languages (DSLs). And as it turns out, writing algorithmic music using a DSL is a major win. Not only are DSLs expressive enough to capture creative ideas, but DSLs written in Lisps are inherently extensible – whereas Haskell’s static typing adds barriers to extensibility. There’s also an excellent music synthesis library for Clojure, Overtone, that I want to be able to take advantage of.

Before we can explore what a DSL for music would look like, we need to understand how HSoM represents music as data.

Music as data

HSoM breaks music down into its component pieces. It represents music using Haskell data structures:

type Octave = Int
data PitchClass = Cff | Cf | C | Dff -- ...etc, all the way to Bss
type Pitch = (PitchClass, Octave)
type Duration = Rational
data Primitive a = Note Duration a
                 | Rest Duration
data Control =
  Tempo Rational
  | Transpose Int
  -- ...a bunch more
data Music a =
  Prim (Primitive a)
  | Music a :+: Music a
  | Music a :=: Music a
  | Modify Control (Music a)

Many of these type declarations are straightforward, but a couple bear further discussion. A PitchClass is an algebraic data type representing all of the pitches: C#, Ab, F, and so on. By pairing a pitch class with an octave, we get a Pitch, which represents a specific note (for instance, middle C would be (C, 4)). A Primitive is a basic music building block, either a note or a rest. Note that it is polymorphic: this is so that we can define types like Note Duration Pitch but also types like Note Duration (Pitch, Loudness) so that we can attach additional data to a primitive if we need to. A Control represents the concept of making a modification to some music by changing the tempo, transposing it, or otherwise changing the output while keeping the underlying notes the same.

The Music type is where things get really interesting. It’s an algebraic data type representing the concept of music in general. In fact, it’s powerful enough to fully represent any piece of music, from Hozier to Bach. A Music value is built with one of four constructors: Prim, which wraps either a note or a rest; Modify, which takes another Music as an argument and modifies it in some way; the :+: infix constructor, which represents two separate Music values played sequentially; and the :=: infix constructor, which represents two separate Music values played simultaneously.

The Music type has some important properties. First, it’s polymorphic for the same reason that the Primitive type is. This allows us to attach any type of data we want to music primitives, letting us express any musical concept (volume, gain, you name it).

Second, three out of its four constructors are recursive – they take other Music values as arguments. This is the key that makes the data model so powerful. It allows you to model arbitrary configurations of notes, e.g. Note 1/4 (C 4) :=: Note 1/4 (E 4) :=: Note 1/4 (G 4) is a C major triad, and that expression evaluates to a Music value that can itself be passed to Modify, :+:, or :=: to weave it into a larger piece of music.

The result is an extraordinarily concise definition that still manages to encompass all possible pieces of music. Using these data structures, we can describe any song we can imagine.

But as powerful as this data type is, I wouldn’t call it a domain-specific language. The static type system makes it inflexible: how would you combine a Note 1/8 ((C 4) 8), representing a note with pitch and loudness, with a Note 1/8 (E 4), representing a note with just a pitch? Sure, you could write a function to convert from one to the other, but at that point you’ve lost elegance and flexibility.

Here’s where Clojure comes in.

A DSL for music with Clojure

What would a domain-specific language for music look like in Clojure? I found inspiration in the HTML templating library Hiccup. Hiccup represents HTML documents (a graph of complex nested nodes, just like music values) using Clojure vectors, like so:

[:div {:class "foo"} [:p "foo"]]

The Hiccup vectors are actually a DSL that can describe arbitrary HTML markup. Anything that can be expressed in HTML can be expressed using Hiccup. It straddles the line between data and code – the vector is flexible and expressive enough to represent any web page, but can be manipulated using standard Clojure library functions.

If we apply this idea to the data structure from HSoM, we end up with something like this:

;; notes and rests are maps
(def eighth-note-c
  {:duration 1/8
   :pitch [:C 4]})

(def eighth-note-e
  {:duration 1/8
   :pitch [:E 4]})

(def eighth-note-rest
  {:duration 1/8})

;; simultaneous music values
[:= eighth-note-c eighth-note-e]

;; sequential music values
[:+ eighth-note-c eighth-note-rest eighth-note-e]

;; modifying music values
[:modify
 {:tempo 120      ;; the control
  :transpose 3}
 {:duration 1/8   ;; the note
  :pitch [:C 4]}]

;; :=, :+, and :modify can operate on any music value,
;; including arbitrary nesting
[:modify {:tempo 120}
  [:+
    [:= {:duration 1/4
         :pitch [:D 4]}
      {:duration 1/4
       :pitch [:F 4]}]
    [:= {:duration 1/4
         :pitch [:C 4]}
      {:duration 1/4
       :pitch [:E 4]}]]]

At first glance, this looks the same as the Haskell data types from HSoM. Both represent notes with pitch and duration; both compose music with a modify operator, a simultaneous operator (:= here, :=: in Haskell), and a sequential operator (:+ here, :+: in Haskell); both support recursive composition of any depth.

But the Clojure version is actually more expressive and flexible than the Haskell equivalent. A note can have any metadata we want attached:

{:duration 1/4
 :pitch [:C 4]
 :loudness 6}

Our DSL has no problem composing notes with differing metadata:

[:+
 {:duration 1/4
  :pitch [:C 4]
  :loudness 6}
 {:duration 1/4
  :pitch [:Eb 4]
  :staccato true}]

Furthermore, because Clojure is dynamically typed and supports duck typing via map keywords, we can write functions that operate on all notes and music values, even those with unexpected metadata.
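As a minimal sketch of that idea, the function below walks any music value and applies a function to each note or rest, without caring what extra keys the maps carry. The names map-notes and octave-up are hypothetical helpers invented for this example; they are not part of Overtone or HSoM.

```clojure
(defn map-notes
  "Applies f to every note or rest map inside a music value, leaving the
  :+, := and :modify structure untouched."
  [f music]
  (if (map? music)
    (f music)
    (let [[op & args] music]
      (if (= op :modify)
        ;; [:modify control music]: keep the control map, recurse into the music
        [:modify (first args) (map-notes f (second args))]
        ;; [:+ m1 m2 ...] or [:= m1 m2 ...]: recurse into every child
        (into [op] (map #(map-notes f %) args))))))

(defn octave-up
  "Raises a note by one octave. Rests (maps without :pitch) pass through."
  [note]
  (if (:pitch note)
    (update-in note [:pitch 1] inc)
    note))

(def phrase
  [:+
   {:duration 1/4 :pitch [:C 4] :loudness 6}
   {:duration 1/8}
   {:duration 1/4 :pitch [:Eb 4] :staccato true}])

(def phrase-up (map-notes octave-up phrase))
```

The :loudness and :staccato keys ride along unchanged, because octave-up only touches the :pitch key it knows about.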

Like the Hiccup vectors, our music vectors blur the boundary between a DSL and a data structure. The vectors are expressive enough to represent any musical concept, but can still be passed around and operated on by normal Clojure functions. As an added advantage, the vectors look similar enough to the HSoM data structures that I can easily follow along with the textbook using Clojure and Overtone.
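Here is a rough sketch of what "operated on by normal Clojure functions" can mean in practice: a function that computes how long a music value lasts, using nothing but destructuring and reduce. The duration function is a hypothetical helper written for this example. It measures length in note values (so 1/4 is a quarter note) and deliberately ignores :tempo and the other controls.

```clojure
(defn duration
  "Length of a music value in note values. Accepts a note/rest map or a
  vector tagged with :+, := or :modify."
  [music]
  (if (map? music)
    (:duration music)
    (let [[op & args] music]
      (case op
        ;; sequential: the children play one after another
        :+      (reduce + 0 (map duration args))
        ;; simultaneous: the longest child determines the length
        :=      (reduce max 0 (map duration args))
        ;; [:modify control music]: skip the control map
        :modify (duration (second args))))))

(def two-chords
  [:modify {:tempo 120}
   [:+
    [:= {:duration 1/4 :pitch [:D 4]}
        {:duration 1/4 :pitch [:F 4]}]
    [:= {:duration 1/4 :pitch [:C 4]}
        {:duration 1/4 :pitch [:E 4]}]]])

(def two-chords-length (duration two-chords))
```

Two quarter-note chords played in sequence come out to a half note, and the same function works unchanged on values nested to any depth.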

What's next

So I have a way to represent music in Clojure now. What’s next? Haskell School of Music ships with a library called Euterpea that knows how to turn the Music data structure into actual sound. So the next step for me is probably porting something like that to Clojure. I’m hoping to offload most of that work to Overtone. After that, I’ll explore algorithmic composition using the techniques outlined in HSoM. Stay tuned!