Weather Text: My Own Bespoke Weather App #

I was a happy user of the Dark Sky weather app for many years. Even more than the localized and timely notifications (I live in a place with predictable weather) I appreciated its Apple Watch complications. I specifically used the three-line textual summary shown in the middle of the Modular face.

Apple Watch Modular face with Dark Sky complication
My preferred Apple Watch face, circa 2022

Apple acquired Dark Sky in 2020, incorporated many of its features into the built-in Weather app (with iOS 16/watchOS 9), and shut down the original app at the end of 2022. Unfortunately the complication layout was not carried over – the only large complications are multi-hour/day graphs that have too much information and are hard to see at a glance. On my iPhone, I first switched to Weather Line (RIP) and later to Weather Strip, but neither had a watch app. I considered CARROT Weather, which has customizable complications, as a possible replacement. However, it couldn’t replicate the same layout, and paying for a separate subscription just for a watch app seemed wasteful.

On the other hand my time is free, and I figured that with WeatherKit (the one good thing to come out of the Dark Sky acquisition), I could build my own complication that showed exactly what I wanted. I procrastinated doing this for more than a year (squinting at the Weather complication every morning and thinking “I should do something about this”) but I reached a lull in Infinite Mac development and decided to once again work on an Apple platform developed in this century. It turns out that my procrastination paid off: watchOS 9 introduced WidgetKit for complications and Xcode 15 made wireless debugging (the only option for Apple Watch development) more reliable.

After fighting a bit with my old nemesis, provisioning profiles, and figuring out how to get location access in a widget (which may change in the future), I had something up and running. WeatherKit makes it particularly easy to replicate the Dark Sky complication – there’s even a property to get an SF Symbol for the current conditions and pre-computed sunrise/sunset times.

Weather Text is the resulting app - it has a minimal watch UI to request location access and show a preview, but the main focus is the complication/widget (which also shows up in Smart Stack).

Weather Text app screen Apple Watch Modular face with Weather Text complication

I decided not to put the app on the App Store. WeatherKit has a limited free quota, and while it was unlikely that the app would suddenly boom in popularity, I didn’t want to worry about it. I could have made the app paid up-front to deter casual installs, but my App Store account is not set up for payments, and I didn’t want to go through the hassle of that.

The app is instead available via TestFlight, which is hopefully enough of a deterrent that I’ll remain under the free limit (and builds expiring every 90 days also gives me additional control). I would have thought that only going through the lightweight TestFlight approval would exempt me from review shenanigans. However, I still had to change my original icon, despite the presence of plenty of other dark background icons, including the original Dark Sky one.

Weather Text is a home cooked meal/hammer kind of app, mostly meant for me (though I’m not the only one who misses Dark Sky’s complications). I don’t have grand ambitions for it, and it’s not a forever project (modulo minimal upkeep to make sure it works with new OS releases). Modern platforms don’t have quite as much end-user programmability as they used to (though widget.json/Widget Construction Set and Scriptable look interesting), but as a middle-aged gentleman programmer it’s nice to still be able to make them fit exactly to my needs.

How I Consume Mastodon #

tl;dr: I use Masto Feeder to generate a nicely-formatted RSS feed of my timeline, and then I can read it in my preferred feed reader.

A bit more than 10 years ago I wrote a post called “How I Consume Twitter”. It described how I contort Twitter to my completionist tendencies by generating RSS¹ feeds for my timeline (for more fine-grained updates) and lists (to read some accounts in a digest). That allowed me to treat Twitter as just another set of feeds in my feed reader, have read state, not need another app, and all the other benefits that RSS provides.

A bunch of things have changed since then: (lots of) Twitter drama, product changes and feed reader churn (I’m currently using NetNewsWire), but the Bird Feeder and Tweet Digest tools in Stream Spigot have continued to serve me well, with the occasional tweaks.

When the people I follow started their migration to Mastodon in early November, I initially relied on Mastodon’s built-in RSS feeds for any user. While that worked as a stopgap solution until I decided what instance to use², it proved unsatisfying almost immediately. The feeds are very bare bones (no rendering of images, polls, or other fancier features) and do not include boosts. Additionally, having potentially hundreds of feeds to crawl independently seemed wasteful, and would require manual management to track people if they ever moved servers.

I saw that there was a Mastodon feature request for a single feed for the entire timeline, but it didn’t seem to be getting any traction. After a couple weeks of waffling, I decided to fix it for myself. The Stream Spigot set of tools I had set up for Twitter feed generation were pretty easily adaptable³ to also handle Mastodon’s API. Over the next few weeks I added boost, spoiler/CW, and poll rendering, as well as list and digest support.

The end result is available at Masto Feeder, a tool to generate RSS feeds from your Mastodon timeline and lists. It can generate both one-item-per-post as well as once-a-day digests. It’s been working well for me for the past few weeks, and should be usable with any Mastodon instance (or anything else that implements its API). The main thing that would make it better is exclusive lists, but there’s some hope of that happening.
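To make the one-item-per-post mode a bit more concrete, here is a minimal sketch of turning one timeline status into an Atom entry. It is not Masto Feeder's actual code (which is Python); it only assumes the field names of the Mastodon API's Status entity (`uri`, `url`, `created_at`, `content`, `account.acct`, `reblog`), and the function names and title format are made up for illustration.

```javascript
// Escape text for use in XML element content or attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Convert one status (shaped like the Mastodon API's Status entity) into an
// Atom <entry>. A boost is a wrapper status whose `reblog` field holds the
// original post, so the original's content is rendered with an attribution.
// The title format is an arbitrary choice for this sketch.
function statusToAtomEntry(status) {
  const post = status.reblog || status;
  const title = status.reblog
    ? `${status.account.acct} boosted ${post.account.acct}`
    : post.account.acct;
  return [
    "<entry>",
    `  <id>${escapeXml(status.uri)}</id>`,
    `  <title>${escapeXml(title)}</title>`,
    `  <link rel="alternate" href="${escapeXml(post.url)}"/>`,
    `  <updated>${escapeXml(status.created_at)}</updated>`,
    // Atom's type="html" expects the HTML to be entity-escaped.
    `  <content type="html">${escapeXml(post.content)}</content>`,
    "</entry>",
  ].join("\n");
}
```

A real generator would also have to render images, polls and content warnings, which is where most of the work described above went.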

Mastodon viewed as a feed in NetNewsWire

As for Twitter, with the upcoming removal of all free API access it likely means the end for Bird Feeder and Tweet Digest. What this means for me is that I’ll most likely stop reading Twitter altogether — I certainly have no interest in using the first-party client with its algorithmic timeline. However, I’ve been enjoying Mastodon more lately anyway ( should find me), and with being able to consume⁴ it in my feed reader, it’s truly more than a 1:1 replacement for Twitter.

Update on June 5, 2023: Twitter API access for Bird Feeder and Tweet Digest was eventually revoked on May 22.

  1. Technically Atom feeds, but I’ve decided to use the term RSS generically since it’s not 2004 anymore.
  2. My dilly-dallying until I settled on an instance meant that someone else got @mihai a few days before I signed up.
  3. It did pain me to have to write a bunch more Python 2.7 code that will eventually be unsupported, but that’s a future Mihai problem.
  4. For posting I end up using the built-in web UI on desktop, or Ivory on my iPhone and iPad. Opener is also handy when I want to open a specific post from NetNewsWire in Ivory (e.g. to reply to it).

Solving Bee: An Augmented Reality Tool for Spelling Bee #

Like many others I’ve spent a lot of time over the past year playing the New York Times’ Spelling Bee puzzle. For those that are not familiar with it, it’s a word game where you’re tasked with finding as many words as possible that can be spelled with the given 7 letters, with the center letter being required. I have by no means mastered it — there are days when getting to the “Genius” ranking proves hard. Especially at those times, I’ve idly thought about how trivial it would be to make a cheating program that applies a simple regular expression through a word list (or even uses something already made). However, that seemed both crude and tedious (entering 7 whole letters by hand).

When thinking of what the ideal bespoke tool for solving Spelling Bee would be, apps like Photomath or various Sudoku solvers came to mind — I would want to point my phone at the Spelling Bee puzzle and get hints for what words to look for, with minimal work on my part. Building such an app seemed like a fun way to play around with the Vision and Core ML frameworks that have appeared in recent iOS releases. Over the course of the past few months I’ve built exactly that, and if you’d like to take it for a spin, it’s available in the App Store. Here’s a short demo video:

Object Detection

The first step was to be able to detect a Spelling Bee “board” using the camera. As it turns out, there are two versions of Spelling Bee, the print and digital editions. Though they are basically the same game, the print one has a simpler display. I ended up creating a Core ML model that had training data with both, with distinct labels (I relied on Jason to send me some pictures of the print version, not being a print subscriber myself). Knowing which version was detected was useful because the print version only accepts 5-letter words, while the digital one allows 4-letter ones.

To create the model, I used RectLabel to annotate images, and Create ML to generate the model. Apple has some sample code for object detection that has the scaffolding for setting up the AVCaptureSession and using the model to get VNRecognizedObjectObservations. The model ended up being surprisingly large (64MB), which was the bulk of the app binary size. I ended up quantizing it to fp16 to halve its size, but even more reduction may be possible.

Print edition of Spelling Bee Digital edition of Spelling Bee
Print edition Digital edition

Text Extraction

Now that I knew where in the image the board was, the next task was to extract the letters in it. The Vision framework has functionality for this too, and there’s also a sample project. However, when I ran a VNRecognizeTextRequest on the image, I was getting very few matches. My guess was that this was due to widely-spaced individual letters being the input, instead of whole words, which makes the job of the text detector much harder. It looked like others had come to the same conclusion.

I was resigned to having to do more manual letter extraction (perhaps by training a separate object detection/recognition model that could look for letters), when I happened to try Apple’s document scanning framework on my input. That uses the higher-level VNDocumentCameraViewController API, and it appeared to be able to find all of the letters. Looking at the image that it generated, it looked like it was doing some pre-processing (to increase contrast) before doing text extraction. I added a simple Core Image filter that turned the board image into a simple black-and-white version and then I was able to get much better text extraction results.
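The app does this step with a Core Image filter; as a language-neutral illustration of the idea, here is a small sketch that thresholds an RGBA pixel buffer into pure black and white. The function name, the luma weights (Rec. 601) and the default threshold are my assumptions, not what the app uses.

```javascript
// Turn an RGBA pixel buffer (4 bytes per pixel, the layout used by canvas
// ImageData) into pure black-and-white by thresholding each pixel's
// luminance. The threshold of 128 is an assumed default.
function toBlackAndWhite(rgba, threshold = 128) {
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i < rgba.length; i += 4) {
    // Rec. 601 luma approximation of perceived brightness.
    const luma = 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
    const value = luma >= threshold ? 255 : 0;
    out[i] = out[i + 1] = out[i + 2] = value;
    out[i + 3] = rgba[i + 3]; // Keep the original alpha.
  }
  return out;
}
```

With this kind of preprocessing a bright hexagon background becomes white and the dark letter strokes become black, which is the high-contrast input the text recognizer seemed to prefer.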

Captured image of Spelling Bee Simplified image of Spelling Bee
Captured board image Processed and simplified board image

The only letter that was still giving me trouble was “I”. Presumably that’s because a standalone capital "I" looks like a nondescript rectangle, and is not obviously a letter. For this I did end up creating a simple separate object recognition model that augments the text extraction result. I trained with images extracted from the processing pipeline, using the somewhat obscure option to expose the app’s Documents directory for syncing via iTunes/Finder. This recognizer can be run in parallel with the VNRecognizeTextRequest, and the results from both are combined.

Board Letter Detection

I now had the letters (and their bounding boxes), but I still needed to know which one was the center (required) letter. Though probably overkill for this, I ended up converting the centers of each of the bounding boxes to polar coordinates, and finding those that were close to the expected location of each letter. This also gave me a rough progress/confidence metric — I would only consider a board’s letters fully extracted if I had the same letters in the same positions from a few separate frames.
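Here is a rough sketch of the polar-coordinate idea. It assumes the board has been normalized to a unit square centered at (0.5, 0.5) with the y axis pointing up, and that the six outer letters sit on a ring at 60-degree intervals; the radii, tolerance and starting angle are illustrative guesses rather than the values the app uses.

```javascript
// Convert a point to polar coordinates around an origin. The angle is in
// degrees, normalized to [0, 360), counter-clockwise with y pointing up.
function toPolar(point, origin) {
  const dx = point.x - origin.x;
  const dy = point.y - origin.y;
  const radius = Math.hypot(dx, dy);
  const angle = ((Math.atan2(dy, dx) * 180) / Math.PI + 360) % 360;
  return { radius, angle };
}

// Assign each detected letter to the center or to one of six outer slots.
// `letters` is [{letter, x, y}], with x/y the bounding-box center in board
// coordinates. All numeric defaults below are assumptions for this sketch.
function assignSlots(letters, options = {}) {
  const {
    centerRadius = 0.1, // Anything this close to the middle is the center.
    outerRadius = 0.33, // Expected distance of the outer ring.
    tolerance = 0.1, // Allowed deviation from the outer ring.
    firstSlotAngle = 30, // Angle of slot 0; slots advance every 60 degrees.
  } = options;
  const origin = { x: 0.5, y: 0.5 };
  const board = { center: null, outer: new Array(6).fill(null) };
  for (const { letter, x, y } of letters) {
    const { radius, angle } = toPolar({ x, y }, origin);
    if (radius <= centerRadius) {
      board.center = letter;
    } else if (Math.abs(radius - outerRadius) <= tolerance) {
      const slot = Math.round((angle - firstSlotAngle + 360) / 60) % 6;
      board.outer[slot] = letter;
    }
  }
  return board;
}
```

Comparing the `board` objects produced from consecutive frames (same center, same letter in every slot) is one way to get the stability check described above.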

Polar coordinates of Spelling Bee

Dictionary Word Lookup

Once I knew what the puzzle input was, the next step was to generate the possible words that satisfied it. Jason had helpfully generated all possible solutions, but that was for the print version, which did not support 4-letter words. I ended up doing on-device solution generation via a linear scan of a word list — iOS devices are fast enough and the problem is constrained enough that pre-generation was not needed.
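The linear scan is simple enough to sketch in a few lines. This is an illustration rather than the app's code (which is Swift): the function name and parameters are made up, and it relies on the Spelling Bee rule that letters may be reused within a word.

```javascript
// Return the words that use only the given letters (reuse allowed), contain
// the required center letter, and are at least `minLength` long. The input
// order of `wordList` is preserved, so a frequency-sorted list stays sorted.
function solve(wordList, letters, center, minLength = 4) {
  const allowed = new Set(letters.toLowerCase());
  const required = center.toLowerCase();
  return wordList.filter((word) => {
    const w = word.toLowerCase();
    if (w.length < minLength || !w.includes(required)) {
      return false;
    }
    for (const ch of w) {
      if (!allowed.has(ch)) {
        return false;
      }
    }
    return true;
  });
}
```

For example, `solve(words, "abcelmt", "b")` covers the digital edition, while passing `5` as the last argument matches the print edition's longer minimum.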

One of the challenges was determining what a valid word is. The New York Times describes Spelling Bee as using “common” words, but does not provide a dictionary. The /usr/share/dict/words list which is commonly used for this sort of thing is based on an out-of-copyright dictionary from 1934, which would not have more recent words. I ended up using the 1/3 million most frequent words from the Google Web Trillion Word Corpus, with some filtering. This had the advantage of sorting the words by their frequency of use, making the word list ascend in difficulty. This list does end up with some proper nouns, so there's no guarantee that all presented words are acceptable as solutions, but it was good enough.

Word Definition Display

To make the app more of a “helper”, I decided to not immediately display the word list, but to have a “clue” in the form of each word’s definitions. iOS has a little-known helper for displaying word definitions - UIReferenceLibraryViewController. While this does display the definition of most words, it doesn’t allow any customization of the display, and I wanted to hide the actual word.

Word list of Spelling Bee Word definition in Spelling Bee
Word list Definition (with word hidden)

It turns out it’s implemented via a WKWebView, and thus it’s possible to inject a small snippet of JavaScript to hide and show the definition. The whole point of this project had been to learn something different from the “hybrid app with web views” world that I inhabit at Quip, but sometimes you just can’t escape the web views.


Now that I had the core functionality working end-to-end, there were still a bunch of finishing touches needed to make it into an “app” as opposed to a tech demo. I ended up adding a splash screen, a “reticle” to make the scanning UI more obvious, and a progress display to show the letters that have been recognized so far.

This was a chance to experiment with SwiftUI. While it was definitely an improvement over auto-layout or Interface Builder, I was still disappointed by the quality of the tooling (Xcode previews would often stop refreshing, even for my very simple project) and the many missing pieces when it comes to integrating with other iOS technologies.

Getting it into the App Store

Despite being a long-time iOS user and developer, this was my first time submitting one of my own apps to the App Store. The technical side was pretty straightforward — I did not encounter any issues with code signing, provisioning profiles or other such things that have haunted Apple platform developers for the past decade. Within a day, I was able to get a TestFlight build out.

However, actually getting the app approved for the App Store was more of an ordeal. I initially got contradictory rejections from Apple (how can an app both duplicate another and not have “enough” functionality) and all interactions were handled via canned responses that were not helpful. I ended up having to submit an appeal to the App Review Board to get constructive feedback, after which the app was approved without further issues. I understand the App Store is an appealing target for scammers, but having to spend so much reviewer bandwidth on a free, very niche-y app does not seem like a great use of limited resources.

Peeking Inside

If you’d like to take a look to see how the app is implemented, the source is available on GitHub.

In-Product Debugging Tools #

I wrote a post on the Quip blog about the various in-product debugging tools that we've developed over the years. It's been very satisfying to make and use our tools, and I'm glad we're finally sharing some details about them.

Quip Editor Debugging Overlay
One of the many overlays we've created

Grafting Local Static Resources onto Production #

tl;dr: Using the webRequest Chrome extension API it is possible to “graft” development/localhost JavaScript and CSS assets on a production web service, thus allowing rapid debugging iteration against real production data sets. Demo site and extension.

During the summer of 2015 I was investigating an annoying bug in Quip where our message list would not stay “bottom-anchored” in some circumstances¹. Unfortunately I was only able to trigger it on our live production site, not on my local development setup. Though Chrome’s developer tools are quite nice, I did not have the necessary ability to rapidly iterate on the code in order to further investigate the bug. I had in the past pushed alternate builds to our staging site to debug such production-only issues, but that would still take several minutes to see the results of every change.

My next thought was that I could instead try to reproduce the bug in our soon-to-be-released desktop app. The app can use local (minimally processed) JavaScript while running against production data. Unfortunately the bug did not manifest itself in our Mac app. I chalked this up to rendering engine differences (the bug was only visible in Chrome, and our Mac app uses a WebKit-backed WebView). I then tried our Windows app (which uses the same rendering engine as Chrome via the Chromium Embedded Framework), but it didn’t happen there either. I was forced to conclude that the bug was due to some specific behavior in our website when running against production data, not something in the shared React-based UI.

As I was wishing for a way to use JavaScript and CSS from my laptop with production data (for security reasons my local Quip server cannot connect to the production databases) I remembered that Gmail used to have exactly such a mode. As I recall it, you could start a local CaribouGmail server, go to your (work) Gmail instance and append a special URL parameter that would cause the JavaScript from the local server to be requested instead². With most of Gmail’s behavior being driven by the client-side JavaScript (with the server serving as an API endpoint) this meant that it was possible to try out pretty complex changes on your own data without having to “deploy” them.

I considered adding this mode to Quip, but that seemed scary, security-wise, since it was effectively intentional cross-site scripting. It also would have meant waiting for the next day’s production push (and I wanted to solve the problem as soon as possible). However, it then occurred to me that I didn’t actually need to have the server change its behavior; I could instead write a Chrome extension which (via the webRequest API) would “graft” the local JavaScript and CSS files from my local server onto the production site when loaded in my browser.

I had hoped that the extension could modify the HTML that is initially served and replace the JavaScript and CSS URLs, but it turns out the webRequest API cannot modify the HTTP response body. What did work was to intercept the JavaScript and CSS requests before they were sent to our CDN and redirect them to paths on my local server. Chrome would initially flag this as being insecure (since we use HTTPS in production, and the redirected URLs were over plain HTTP), but it is possible to convince it to load the resources anyway.
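A minimal sketch of the redirect step, using the blocking `chrome.webRequest.onBeforeRequest` listener (Manifest V2 style, which needs the `webRequest` and `webRequestBlocking` permissions plus host permissions). The CDN and localhost prefixes and the `graftRequest` name are hypothetical placeholders, not Quip's real URLs or the actual extension code.

```javascript
// Hypothetical prefixes: the production CDN and the local dev server.
const CDN_PREFIX = "https://cdn.example.com/static/";
const LOCAL_PREFIX = "http://localhost:8080/static/";

// Decide what to do with a request: JavaScript and CSS served from the CDN
// are redirected to the same path on localhost, everything else is left
// untouched (an empty response object means "no change").
function graftRequest(details) {
  const { url } = details;
  if (url.startsWith(CDN_PREFIX) && /\.(js|css)(\?|$)/.test(url)) {
    return { redirectUrl: LOCAL_PREFIX + url.slice(CDN_PREFIX.length) };
  }
  return {};
}

// Only register the listener when running inside an extension.
if (typeof chrome !== "undefined" && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    graftRequest,
    { urls: [CDN_PREFIX + "*"] },
    ["blocking"]
  );
}
```

Keeping the decision in a plain function makes it easy to check the URL mapping outside the browser before loading the extension.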

Once I had the necessary tooling and ability to iterate quickly, fixing the bug that prompted all this was pretty straightforward (it was caused by the “mount point” system that we used to incrementally migrate our website to React, but that’s a whole other blog post). Since then it’s come in handy in debugging other hard-to-recreate problems, and for measuring JavaScript performance against more realistic data. It did briefly break when we added a Content Security Policy (CSP) — since we were loading scripts from an unknown domain the browser was correctly blocking the “grafted” response. However, the webRequest API also allows the extension to edit the response headers, thus it was straightforward to have it intercept the main HTML page request and strip the CSP header.
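The CSP workaround can be sketched with the `chrome.webRequest.onHeadersReceived` event, which (with the `blocking` and `responseHeaders` options) lets an extension return a modified header list. The URL pattern and the `stripCsp` name are placeholders I made up for this example.

```javascript
// Remove any Content-Security-Policy header from a response. Header names
// are case-insensitive, so compare in lowercase.
function stripCsp(details) {
  const responseHeaders = (details.responseHeaders || []).filter(
    (header) => header.name.toLowerCase() !== "content-security-policy"
  );
  return { responseHeaders };
}

// Only register the listener when running inside an extension. Restricting
// it to main_frame requests limits it to the top-level HTML page.
if (typeof chrome !== "undefined" && chrome.webRequest) {
  chrome.webRequest.onHeadersReceived.addListener(
    stripCsp,
    { urls: ["https://app.example.com/*"], types: ["main_frame"] },
    ["blocking", "responseHeaders"]
  );
}
```

Since this removes a security protection, it only makes sense in a development browser profile and scoped to the one site being debugged.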

The extension that I wrote to accomplish this is very barebones and hardcodes a bunch of Quip-specific logic and URLs, thus is not easily shared. However, I have recreated a simplified version of it and put it in my web experiments repository. There is also a demo site that it can be applied to.

  1. Yet more developer time spent faking something that should be a built-in capability, further confirming Bret’s observation.
  2. For a bit more history: back in 2005 I was using Greasemonkey to hack Gmail left and right. When I talked to the Gmail team about this approach (versus working in my 20% time to add those features to Gmail directly) I rationalized it as “Greasemonkey lets me do UI experiments on the real email in my account with minimal lag, instead of needing to wait for code reviews and production pushes.” Darick Tong (a Gmail engineer) took this feedback to heart and added the custom JavaScript mode. Unfortunately by that point I had mostly moved on from Gmail hacking (Reader was keeping me plenty busy, JavaScript-wise), so I never got to actually use it.

Teaching the Closure Compiler About React #

tl;dr: react-closure-compiler is a project that contains a custom Closure Compiler pass that understands React concepts like components, elements and mixins. It allows you to get type-aware checks within your components and compile React itself alongside your code with full minification.

Late last year, Quip started a gradual migration to React for our web UI (incidentally the chat features that were launched recently represent the first major functionality to be done entirely using React). When I started my research into the feasibility of using React, one of my questions was “Does it work with the Closure Compiler?” At Quip we rely heavily on it not just for minification, but also for type annotations to make refactorings less scary and code more self-documenting¹, and for its many warnings to prevent other gotchas in JavaScript development. The tidbits that I found were encouraging, though a bit sparse:

  • An externs file with type declarations for most of React's API²
  • A Quora post by Pete Hunt (a React core contributor) describing React as “closure compiler compatible”
  • React's documentation about refs mentions making sure to quote refs annotated via string attributes³

In general I got the impression that it was certainly possible to use React with the Closure Compiler, but that not a lot of people were, and thus I would be off the beaten path⁴.

My first attempt was to add react.js (the unminified version) as source input along with a simple “hello world” component⁵. The rationale behind doing it this way was that, if React was to be a core library, it should be included in the main JavaScript bundle that we serve to our users, instead of being a separate file. It also wouldn't need an externs file, since the compiler was aware of it. Finally, since it was going to be minified with the rest of our code, I could use the non-minified version as the input, and get better error messages. I was then greeted by hundreds of errors and warnings which broadly fell into three categories:

  1. “illegal use of unknown JSDoc tag providesModule” and similar warnings about JSDoc tags that the React source uses that the Closure Compiler didn't understand
  2. “variable React is undeclared” indicating that the Closure compiler did not realize what symbols react.js exported, most likely because the module wrapper that it uses is a bit convoluted, and thus it's not obvious that the exported symbols are in the global scope
  3. “dangerous use of the global this object” within my component methods, since the Closure Compiler did not realize that the functions within the spec passed to React.createClass were going to be run as methods on the component instance.

Since I was still in a prototyping stage with React, I looked into the most minimal set of changes I could do to deal with these issues. For 2, adding the externs file to our list helped, since the compiler now knew that there was a React symbol and its many properties. This did seem somewhat wrong, since the React source was not actually external, and it was in fact safe to (globally) rename createClass and other methods, but it did quieten those errors. For 1 and 3 I wrote a small custom warnings guard that ignored all “errors” in the React source itself and the “dangerous use of the global this object” warning in .jsx files.

Once I did all that, the code compiled, and appeared to run fine with all the other warnings and optimizations that we had. However, a few days later, as I was working on a more complex component, I ran into another error. Given:

var Comp = React.createClass({
    render: function() {...},
    someComponentMethod: function() {...}
});
var compInstance = React.render(React.createElement(Comp), ...);

I was told that someComponentMethod was not a known property on compInstance (which was of type React.ReactComponent — per the externs file). This once again boiled down to the compiler not understanding the React.createClass construct (i.e. that it defined a type). It looked like I had two options for dealing with this:

  1. Add a @suppress {missingProperties} annotation at the callsite, so that the compiler wouldn't complain about the property that it didn't know about
  2. Add a @lends {React.ReactComponent.prototype} annotation to the class spec, so that the compiler would know that someComponentMethod was indeed a method on components (this seemed to be the approach taken by some other code I came across).

The main problem with 2 is that it then told the compiler that all component instances had a someComponentMethod method, which was not true. However, it seemed like the best option, so I added it and kept writing more components.

After a few more weeks, when more engineers started to write React code, these limitations started to chafe a bit. There was both the problem of having to teach others about how to handle sometimes cryptic error messages (@lends is not a frequently-encountered JSDoc tag), as well as genuine bugs that were missed because the compiler did not have a good enough understanding of the code patterns to flag them. Additionally, the externs file didn't quite match with the latest terminology (e.g. React.render's signature had it both taking and returning a ReactComponent). Finally, the use of an externs file meant that none of the React API calls were getting renamed, which was adding some bloat to our JavaScript.

After thinking about these limitations for a while, I began to explore the possibility of creating a custom Closure Compiler pass that would teach it about components, mixins, and other React concepts. It already had a custom pass that remapped goog.defineClass calls to class definitions, so teaching it about React.createClass didn't seem like too much of a stretch.

Fast forward a few weeks (and a baby) later, and react-closure-compiler is a GitHub project that implements this custom pass. It takes constructs of the form:

var Comp = React.createClass({
    render: function() {...},
    otherMethod: function() {...}
});

And transforms it to (before any of the normal compiler checks or type information was extracted):

/**
 * @interface
 * @extends {ReactComponent}
 */
function CompInterface() {}
CompInterface.prototype = {
    render: function() {},
    otherMethod: function() {}
};
/** @typedef {CompInterface} */
var Comp = React.createClass({
    /** @this {Comp} */
    render: function() {...},
    /** @this {Comp} */
    otherMethod: function() {...}
});
/** @typedef {ReactElement.<Comp>} */
var CompElement;

Things of note in the transformed code:

  • The CompInterface type is necessary in order to teach the compiler about all the methods that are present on the component. Having it as an @interface means that no extra code ends up being generated (and the existing code is left untouched). The methods in the interface are just stubs — they have the same parameters (and JSDoc is copied over, if any), but the body is empty.
  • The @typedef is added to the component variable so that user-authored code can treat that as the type (the interface is an implementation detail).
  • The @this annotations that are automatically added to all component methods mean that the compiler understands that those functions do not run in the global scope.
  • The CompElement @typedef is designed to make adding types to elements for that component less verbose.

A bit more formally, these are the types that the compiler knows about given the Comp definition:

  • ReactClass.<Comp>, for the class definition
  • ReactElement.<Comp> for an element created from that definition (via JSX or React.createElement())
  • Comp for rendered instances of this component (this is a subclass of ReactComponent).

This means that, for example, you can use {Comp} as a @return, @param or @type annotation for functions that operate on rendered instances of Comp. Additionally, React.render invocations on JSX tags or explicit React.createElement calls are automatically annotated with the correct type.

To teach the compiler about the React API, I ended up having a types.js file with the full API definition (teaching the compiler how to parse the module boilerplate seemed too complex, and in any case the React code does not have type annotations for everything). For the actual type hierarchy, in addition to looking at the terminology in the React source itself, I also drew on the TypeScript and Flow type definitions for React. Note that this is not an externs file, it's injected into the React source itself (since it's inert, it does not result in any output changes). This means that all React API calls can be renamed (with the exception of React.createElement, which cannot be renamed due to the collision with the createElement DOM API that's in another externs file).

Having done the basics, I then turned to mixins (one of the reasons why we're not using ES6 class syntax for components). I ended up requiring that mixins be wrapped in a React.createMixin(...) call, which was introduced with React 0.13 (though it's not documented). This means that it's possible to cheaply understand mixins: [SomeMixin] declarations in the compiler pass without having to do more complex source analysis.

The README covers more of the uses and gotchas, but the summary is that Quip itself is using this compiler pass to pre-process all our client-side code. The process of converting our 400+ components (from the externs type annotations) took a couple of days (which included tweaks to the pass itself, as well as fixing a few bugs that the extra checks uncovered).

The nice thing about having custom code in the compiler is that it provides an easy point to inject more React-specific behavior. For example, we're heavy users of propTypes, but they're only useful when using the non-minified version of React — propTypes are not checked in minified production builds. The compiler pass can thus strip them if compiling with the minified version.

Flow was the obvious alternative to consider if we wanted static type checking that was React-aware. I also more recently came across Typed React. However, extending the Closure Compiler allows us to benefit from the hundreds of other (non-React) source files that have Closure Compiler type annotations. Additionally, the compiler is not just a checker, it is also a minifier, and some minification passes rely on type information, thus it is beneficial to have type information accessible to the compiler. One discovery that I made while working on this project is that the compiler has a pass that converts type expressions to JSDoc, and generally seems to have some understanding of type expressions that (at least superficially) resemble Flow's and TypeScript's. It would be nice to have one type annotated codebase that all three toolchains could be run on, but I think that's a significant undertaking at this point.

If you use React and the Closure Compiler together, please give the pass a try (it integrates with Plovr easily, and can otherwise be registered programmatically) and let me know how it works out for you.

  1. I continue to find doing large-scale refactorings less scary in our client-side code than ones in our server-side Python code, despite better test coverage in the latter environment.
  2. I ended up contributing to it a bit, as we started to use less common React APIs.
  3. Spelunking through React's codebase that I did much later turned up keyOf and many other indicators that React was definitely developed with unquoted property renaming minification in mind.
  4. Indeed the original creator of the React externs file has indicated that he's no longer using the combination of React/Closure Compiler.
  5. Which used JSX, but that was not of interest to the Closure Compiler: it was transformed to plain JavaScript before the compiler saw it.

RetroGit #

tl;dr: RetroGit is a simple tool that sends you a daily (or weekly) digest of your GitHub commits from years past. Use it as a nostalgia trip or to remind you of TODOs that you never quite got around to cleaning up. Think of it as Timehop for your codebase.

It's now been a bit more than two years since I joined Quip. I recall a sense of liberation the first few months as we were working in a very small, very new codebase. Compared with the much older and larger projects at Google, experimentation was expected, technical debt was non-existent, and in any case it seemed quite likely that almost everything would be rewritten before any real users saw it¹. It was also possible to skim every commit and generally have a sense that you could keep the whole project in your head.

As time passed, more and more code was written, prototypes were replaced with “productionized” systems and whole new areas that I was less familiar with (e.g. Android) were added. After about a year, I started to have the experience, familiar to any developer working on a large codebase for a while, of running blame on a file and being surprised by seeing my own name next to foreign-looking lines of code.

Generally, it seemed like the codebase was still manageable when working in a single area. Problems with keeping it all in my head appeared when doing context switches: working on tables for a month, switching to annotations for a couple of months, and then trying to get back into tables. By that point tables had been “swapped out” and it all felt a bit alien. Extrapolating from that, it seemed like coming back to a module a year later would effectively mean starting from scratch.

I wondered if I could build a tool to help me keep more of the codebase “paged in”. I've been a fan of Timehop for a while, back to the days when they were known as 4SquareAnd7YearsAgo. Besides the nostalgia factor, it did seem like periodic reminders of places I've gone to helped to keep those memories fresher. Since Quip uses GitHub for our codebase (and I had also migrated all my projects there a couple of years ago), it seemed like it would be possible to build a Timehop-like service for my past commits via their API.

I had also wanted to try building something with Go², and this seemed like a good fit. Between go-github and goauth2, the “boring” bits would be taken care of. App Engine's Go runtime also made it easy to deploy my code, and it didn't seem like this would be a very resource-intensive app (famous last words).

I started experimenting over Fourth of July weekend, and by working on it for a few hours a week I had it emailing me my daily digests by the end of the month. At this point I ran into what Akshay described as the “eh, it works well enough” trough, where it was harder to find the motivation to clean up the site so that others could use it too. But eventually it did reach a “1.0” state, including a name change, ending up with RetroGit.

The code ended up being quite straightforward, though I'm sure I have quite a ways to go before writing idiomatic Go. The site employs a design similar to Tweet Digest, where it doesn't store any data beyond an OAuth token, and instead just makes the necessary API calls on the fly to get the commits from years past. The GitHub API behaved as advertised; the only tricky bit was how to handle my aforementioned migrated repositories. Their creation dates were 2011-2012, but they had commits going back much further. I didn't want to “probe” the interval going back indefinitely, just in case there were commits from that year (in theory someone could import some very old repositories into GitHub³). I ended up using the statistics endpoint to determine when the first commit for a user was in a repository, and persisting that as a “vintage” timestamp.
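
As a rough illustration of how a “vintage” timestamp bounds the year-by-year lookups, here is a minimal sketch. It is written in JavaScript purely for illustration (RetroGit itself is in Go), and pastYearIntervals, its parameters, and the sample dates are hypothetical rather than taken from the RetroGit source. The since/until names are chosen to resemble the date-range parameters of GitHub's commit listing endpoint.

```javascript
// Hypothetical helper: for a given "today" and a repository's vintage
// (date of the user's first commit there), list the one-day UTC windows
// that fall on the same calendar day in each earlier year, newest first.
// Note: Feb 29 rolls over to Mar 1 in non-leap years.
function pastYearIntervals(today, vintage) {
  const intervals = [];
  if (Number.isNaN(vintage.getTime())) return intervals;
  const month = today.getUTCMonth();
  const day = today.getUTCDate();
  for (let year = today.getUTCFullYear() - 1; ; year--) {
    const start = new Date(Date.UTC(year, month, day));
    const end = new Date(Date.UTC(year, month, day + 1));
    // Stop once the whole window is before the first known commit,
    // instead of probing further and further into the past.
    if (end <= vintage) break;
    intervals.push({ since: start.toISOString(), until: end.toISOString() });
  }
  return intervals;
}

// Sample dates are made up for the example.
const intervals = pastYearIntervals(
  new Date(Date.UTC(2014, 10, 12)),  // "today": 2014-11-12
  new Date(Date.UTC(2011, 5, 1))     // vintage: 2011-06-01
);
console.log(intervals);
```

Each resulting window would then turn into one commit query per repository, and repositories with a more recent vintage simply produce fewer windows.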

I'm not entirely happy with the visual design: I like the general “retro” theme, but I think executing it well is a bit beyond my Photoshop abilities. The punch card graphic is based on this “Fortran statement” card from this collection. WhatTheFont! identified the header font as ITC Blair Medium. Hopefully the styling within the emails is restrained enough that it won't affect readability. Relatedly, this was my first project where I had to generate HTML email, and I escaped with most of my sanity intact, though some things were still annoying. I found the CSS compatibility tables from MailChimp and Campaign Monitor, though I'm happy that I don't have to care too much about more “mass market” clients (sorry Outlook users).

As to whether or not RetroGit is achieving its intended goal of helping me keep more of the Quip codebase in my head, it's hard to say for sure. One definite effect is that I pay more attention to commit messages, since I know I'll be seeing them a year from now. They're not quite link bait, but I do think that going beyond things like “Fixes #787” to also include a one-line summary in the message is helpful. In theory the issue has more details as to what was broken, but they can end up being re-opened, fixes re-attempted, etc. so it's nice to capture the context of a commit better. I've also been reminded of some old TODOs and done some commenting cleanups when it became apparent a year later that things could have been explained better.

If you'd like to try it yourself, all the site needs is for you to sign in with your GitHub account. There is an FAQ for the security conscious, and for the paranoid running your own instance on App Engine should be quite easy — the README covers the minimal setup necessary.

  1. It took me a while to stop having hangups about not choosing the most optimal/scalable solution for all problems. I didn't skew towards “over-engineered” solutions at Google, but somehow enough of the “will it scale” sentiment did seep in.
  2. My last attempt was pre-Go 1.0, and was too small to really “stick”.
  3. Now that Go itself has migrated to GitHub, the Gophers could use this to get reminders of where they started.

Gmail's HTML Tag Whitelist #

I couldn't find a comprehensive list of the HTML tags that Gmail's sanitizer allows through, so I wrote one up.

Using ASan with iOS Applications #

I've written up a quick guide for getting ASan (Address Sanitizer) working with iOS apps. This is the kind of thing I would have put directly into this blog in the past, but:

  1. Blogger's editor is not pleasant to use — I usually end up editing the HTML directly, especially for posts with code blocks. Not that Quip doesn't have bugs, but at least they're our bugs.
  2. Quip has public sharing now, so in theory that doc should be just as accessible (and indexable) as a regular post.

However, I still like the idea of this blog being a centralized repository of everything that I've written, hence this "stub" post.

Adding Keyboard Shortcuts For Inspecting iOS Apps and Web Pages in Safari #

Back in iOS 6 Apple added the ability to remotely inspect pages in mobile Safari and UIWebViews. While I'm very grateful for that capability, the fact that it's buried in a submenu in Safari's “Develop” menu means that I have to navigate a maze with a mouse every time I relaunch the app. I decided to investigate adding a way of triggering the inspector via a keyboard shortcut.

iPhone Simulator menu
The target

My first thought was that I could add a keyboard shortcut via OS X's built-in support. After all, “mobile.html” is just another menu item. Something like:

iPhone Simulator menu
If only it were so easy

Unfortunately, while that worked if I opened the “Develop” menu at least once, it didn't on a cold start of Safari. I'm guessing that the contents of the menu are generated dynamically (and lazily), and thus there isn't a “mobile.html” item initially for the keyboard shortcut system to hook into.

Inspired by a similar BBEdit script, I then decided to experiment with AppleScript and the System Events UI automation framework. After cursing at AppleScript for a while (can't wait for JavaScript for Automation), I ended up with:

tell application "Safari" to activate
tell application "System Events" to ¬
    click menu item "mobile.html" of menu ¬
        "iPhone Simulator" of menu item "iPhone Simulator" of menu ¬
        "Develop" of menu bar item "Develop" of menu bar 1 of process "Safari"

That seemed to work reliably, so now it was just a matter of binding it to a keyboard shortcut. There are apps like FastScripts that provide this capability, but to make the script more portable, I wanted a way that didn't depend on third-party software. It turned out that Automator can be used to do this, albeit in a somewhat convoluted fashion:

  1. Launch Automator
  2. Create a new “Service” workflow
  3. Add a “Run AppleScript” action¹
  4. Change the setting at the top of the window to “Service receives no input in any application”
  5. Replace the (* Your script goes here *) placeholder with the script above (your workflow should end up looking like this)
  6. Save the service as “Inspect Simulator”

I wanted to attach a keyboard shortcut to this service when either Safari or the simulator was running, but not in other apps. I therefore went to the “App Shortcuts” keyboard preferences pane (pictured above) and added shortcuts for that menu item in both apps (to add shortcuts for the simulator, you need to select the “Other…” option in the menu and select it from /Applications/

One final gotcha is that the first time the script is run in either app, you will get a “The action 'Run AppleScript' encountered an error.” dialog. Immediately behind that dialog is another, saying “'' would like to control this computer using accessibility features.” You'll need to open the Security & Privacy preferences pane and enable Safari (and the simulator's) accessibility permissions.

  1. Not to be confused with the “Execute AppleScript” action, which is a Remote Desktop one; I did that and was puzzled by the “no computers” error message for a good while.

Per-Package Method Counts for Android's DEX Format #

Quip's Android app recently ran into the Android DEX/Dalvik 64K method limit. We suspected that this was due to code generated by the Protocol Buffer compiler¹, but we wanted to get more specific numbers, to both understand the situation better and track our progress. As a starting point, we figured per-package method counts would give us what we needed.

The Android SDK ships with a dexdump tool that disassembles .dex (or .apk) files and dumps certain information out of them. Running it with the -f flag generated a method_ids_size line that showed that we were indeed precariously close to the limit. The tool supports an XML output and per-class output of methods, so it seemed like a straightforward task to write a script to group methods and classes by package. However, once I actually processed its output, I got a much lower number than expected (when I did a sanity check to add up all the per-package counts). It turned out that the XML output is hardcoded to only output public classes and methods.

I then held my nose and rewrote the script to instead parse dexdump's text format. Unfortunately, even then there was some undercounting — not as significant, but I was missing a few thousand methods. I looked at the counts for a few classes, and nothing seemed to be missing, so this was perplexing. After some more digging, it turned out that the limit counts referenced methods too, not just those defined in the DEX file. Therefore iterating over the methods defined in each class was missing most of the android.* methods that we were calling.

Mohammad then pointed me at a script that used the smali/baksmali assembler/disassembler to generate per-package counts. However, when I ran it, it seemed to overcount. Looking into it a bit more, it looked like the script disassembled the .apk, re-assembled it to generate a .dex per package, and then ran dexdump on each one. However, this meant that referenced methods were counted by each package that used them, thus the overall count would include them more than once.

I briefly considered modifying dexdump to extract the information that I needed, but it didn't seem like a fun codebase to work in; besides being in C++ it had lots of dependencies on the rest of the Android tree. Looking around for other DEX format parsers turned up smali's, dexinfo, dexinsight, dexterity, dexlib and a few others. All seemed to require a bit more effort to build and understand than I was willing to put in late on a Friday night. However, after browsing around through the Android tree more, I came across the dexdeps tool². It is designed for separating referenced and defined methods (and classes), but its DEX file parser looked simple enough to modify to extract the data that I was interested in. Better yet, it had no other dependencies, and looked straightforward to build.

Sure enough, it was pretty easy to modify it to create a per-package method counting tool. After a few more commits, I ended up with a dex-method-counts tool that can be pointed at an APK (or DEX file) and provide a package hierarchy tree-view of defined and referenced method counts. The README has a few more details, including a few flags that I've found useful when looking at protocol buffer compiler-generated code.
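
The counting rule that avoids both the undercounting and the overcounting described above is to count every entry in the method table exactly once (defined or merely referenced), attribute it to the package of its declaring class, and roll the totals up the package tree. Here is a minimal sketch of that idea in JavaScript; the real dex-method-counts tool is Java and parses the DEX file itself, whereas the input list and the countByPackage name here are hypothetical.

```javascript
// Hypothetical input: one entry per method reference in the DEX method ID
// table (defined or merely referenced), written as "package.Class.method".
const methodRefs = [
  "com.example.app.MainActivity.onCreate",
  "com.example.app.MainActivity.onResume",
  "com.example.proto.Message.parseFrom",
  "android.app.Activity.onCreate",
  "android.util.Log.d",
];

// Count each reference once, then add it to every enclosing package so
// that parent packages include their children's totals.
function countByPackage(refs) {
  const counts = new Map();
  for (const ref of new Set(refs)) {
    const parts = ref.split(".");
    const packageParts = parts.slice(0, -2);  // drop class and method names
    for (let i = 1; i <= packageParts.length; i++) {
      const pkg = packageParts.slice(0, i).join(".");
      counts.set(pkg, (counts.get(pkg) || 0) + 1);
    }
  }
  return counts;
}

const counts = countByPackage(methodRefs);
for (const pkg of [...counts.keys()].sort()) {
  console.log(pkg + ": " + counts.get(pkg));
}
```

Because referenced android.* methods appear in the list once, they count toward the limit without being attributed again to each package that calls them.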

As for how we solved our actual method count limit problem, we've so far managed to stave off doom by refactoring our .proto files to include fewer messages in our Java build (we were picking up some that were for other platform or server use only). That is, we have no permanent fix yet.

  1. For others in this situation, Square's Wire library may be an alternative.
  2. Somewhat amusingly, this is not the only Java-based DEX parser in the Android source tree, there is also dex-tools in the Compatibility Test Suite area.

Finding Messages Explicitly Marked as Spam in Gmail #

tl;dr: Search Gmail for “is:spam -label:^os” to find messages that you manually marked as spam (as opposed to ones that Gmail automatically marked for you).

Gmail recently had a bug where some emails were accidentally moved to the trash or marked as spam. Google “encouraged” users that might have been affected to check their trash and spam folders for any messages that didn't belong. Since I get a lot of spam (one of the perks of having the same email address since 1996), I didn't relish the thought of going through thousands of messages to see if any of them were mislabeled¹.

I figured that Gmail must keep track of which messages were explicitly marked as spam by the user versus ones that it automatically classifies (though I get a lot of spam, almost all of it is caught by Gmail's filters). Gmail (like Google Reader) keeps track of per-message state via internal system labels. For example, others have discovered that Gmail's Smart Labels are represented as ^smartlabel_type labels while Superstars uses names like ^ss_sy. Indeed, if you try to use a caret in a label name, Gmail says that it is not allowed.

It therefore seemed like a reasonable assumption that there was a system label that would tell us how a message came to be marked as spam. The problem was to figure out what it was called.

Thinking back to Reader (where all label operations went through an edit-tag HTTP API call, which listed the labels to be added or removed), I figured I would see what the request was when marking a message as spam. Unfortunately, it looked like Gmail's requests were at a slightly higher abstraction level, where marking a message as spam would send a request with an act=sp parameter (while marking as read uses act=rd, and so on).

I then figured I should look at the HTTP response when loading the spam folder. There appeared to be a bunch of system label names associated with each message. One that I explicitly marked as spam had the labels:

"^a", "^ad_1391126400000", "^all", "^bsm", "^clu_group", "^clu_unim", "^cob-processed-gmr", "^cob_pevent", "^oc_group", "^os_group", "^s", "^smartlabel_group", "^u"

Meanwhile, another that had been automatically marked as spam used:

"^ad_1391126400000", "^all", "^bsm", "^clu_notification", "^cob-processed-gmr", "^oc_notification", "^os", "^os_notification", "^s", "^smartlabel_notification", "^u"

^s was present on all of them, and indeed doing a search for label:^s shows all spam messages (and the UI rewrites the search to in:spam). Others could also be puzzled out based on name, for example ^u is for unread messages. The more mysterious ones like ^cob_pevent I figured I could ignore².

After looking at a bunch of messages, both automatically and manually marked as spam, ^os stood out. It only seemed to be present on messages that Gmail itself had decided were spam. Doing the search is:spam -label:^os seemed to show only messages that I had marked as spam. Indeed, each of the messages in the result displayed the header: "Why is this message in Spam? You clicked 'Report spam' for this message." Thus I was able to go through the much shorter list and see if any were mistakenly marked (they weren't).
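
The comparison itself is just set arithmetic: find labels that appear on every automatically marked message and on none of the manually marked ones. A minimal sketch, using the two label lists quoted above as the only samples (the distinguishingLabels helper is a hypothetical name, and it assumes at least one message in the first group):

```javascript
// Label lists as seen in the responses quoted above (one message each).
const manuallyMarked = [
  ["^a", "^ad_1391126400000", "^all", "^bsm", "^clu_group", "^clu_unim",
   "^cob-processed-gmr", "^cob_pevent", "^oc_group", "^os_group", "^s",
   "^smartlabel_group", "^u"],
];
const automaticallyMarked = [
  ["^ad_1391126400000", "^all", "^bsm", "^clu_notification",
   "^cob-processed-gmr", "^oc_notification", "^os", "^os_notification", "^s",
   "^smartlabel_notification", "^u"],
];

// Labels found on every message in `present` and on no message in `absent`.
function distinguishingLabels(present, absent) {
  const excluded = new Set(absent.flat());
  return present
    .reduce((common, labels) => common.filter((l) => labels.includes(l)))
    .filter((label) => !excluded.has(label));
}

const candidates = distinguishingLabels(automaticallyMarked, manuallyMarked);
console.log(candidates);
```

With only one message of each kind, the category-specific labels (the _notification ones) show up as candidates alongside ^os. Adding more sample messages from different categories to each group is what narrows the list down.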

Seeing the plethora of labels that were present on all messages, I got curious what other internal labels there were. Between examining HTTP responses, looking through Gmail's JavaScript for strings that start with ^ and a simple dictionary attack for two-letter names, here are some others that I've found (those that are marked as “unknown” are ones that match some messages in my account, but with no apparent pattern):

  • ^a: archived conversations
  • ^b: chat transcripts (equivalent to is:chat, presumably the “b” is for “Buzz”, Google Talk's codename)
  • ^f: sent messages (equivalent to is:sent)
  • ^g: muted conversations (equivalent to is:muted, the “g” is most likely for “ignore”)
  • ^i: inbox (equivalent to in:inbox)
  • ^k: trashed messages (equivalent to in:trash, unclear why “k” is the abbreviation)
  • ^o: unknown
  • ^p: messages that were marked as phishing attempts
  • ^r: drafts (equivalent to is:draft)
  • ^s: spam (equivalent to is:spam)
  • ^t: starred messages (equivalent to is:starred, the “t” is most likely for “to do”)
  • ^u: unread messages (equivalent to is:unread)
  • ^ac: Google Buzz messages (equivalent to is:buzz)
  • ^act: Google Buzz messages (unclear how it's different from ^ac)
  • ^af: unknown
  • ^bc: unknown subset of chat transcripts
  • ^p_cc: another unknown subset of chat transcripts
  • ^fs: unknown
  • ^ia: unknown
  • ^ii: unknown
  • ^im: unknown
  • ^iim: Priority Inbox (based on Android's documentation)
  • ^mf: unknown
  • ^np: unknown
  • ^ns: unknown
  • ^bsm: unknown
  • ^op: messages that were automatically marked as phishing attempts
  • ^os: messages that were automatically marked as spam
  • ^vm: Google Voice voicemails (equivalent to is:voicemail)
  • ^pop: unknown, seems to match some (very old) messages that I imported via POP
  • ^ss_sy, ^ss_so, ^ss_sr, ^ss_sp, ^ss_sb, ^ss_sg, ^ss_cr, ^ss_co, ^ss_cy, ^ss_cg, ^ss_cb, ^ss_cp: Superstar stars
  • ^sl_root, ^smartlabel_promo, _receipt, _travel, _event, _group, _newsletter, _notification, _personal, _social and _finance: Smart Labels
  • ^io_im: important messages (equivalent to is:important)
  • ^io_imc1 through ^io_imc5, ^io_lr: unknown, possibly more degrees of importance (“Info Overload” was the project that resulted in the importance filtering)
  • ^clu_unim: unknown, possibly unimportant messages
  • ^unsub and ^hunsub: messages where an unsubscribe link has been detected (when marking one as spam, the “In addition to marking this message as spam, you can unsubscribe...” dialog appears). ^unsub seems to be for messages where there's an unsubscribe link you have to click while ^hunsub is for ones where Gmail offers to unsubscribe on your behalf.
  • ^cff: sender is in a Google+ circle (equivalent to has:circle)
  • ^sps: unknown (no matches in my account, but it was referenced in the JavaScript next to ^p, if I had to guess I would say it's something related to spear phishing)
  • ^p_esnotif: Google+ notifications ("es" presumably being "Emerald Sea", Google+'s code name)

  1. Of course, in deciding to automate this task, I doomed myself to spend more time than I would have if I'd just gone through the messages by hand.
  2. It's somewhat interesting to see how features that were developed later (like Smart Labels — ^smartlabel_group) use longer system label names than ones of medium age (like Superstars — ^ss_sy) which are in turn longer than the original system labels (^u for unread, etc.). Bytes 10 years ago were clearly more precious.

Using Google Reader's reanimated corpse to browse archived data #

Having gotten all my data out of Google Reader, the next step was to do something with it. I wrote a simple tool to dump data given an item ID, which let me do spot checks that the archived data was complete. A more complete browsing UI was needed, but this proved to be slow going. It's not a hard task per se, but the idea of re-implementing something that I worked on for 5 years didn't seem that appealing.

It then occurred to me that Reader is a canonical single page application: once the initial HTML, JavaScript, CSS, etc. payload is delivered, all other data is loaded via relatively straightforward HTTP calls that return JSON (this made adding basic offline support relatively easy back in 2007). Therefore if I served the archived data in the same JSON format, then I should be able to browse it using Reader's own JavaScript and CSS. Thankfully this all occurred to me the day before the Reader shutdown, thus I had a chance to save a copy of Reader's JavaScript, CSS, images, and basic HTML scaffolding.

zombie_reader is the implementation of that idea. It's available as another tool in my collection. Once pointed at a directory with an archive generated by reader_archive, it parses it and starts an HTTP server on port 8074. Beyond serving the static resources that were saved from Reader, the server implements a minimal (read-only) subset of Reader's API.

The tool required no modifications to Reader's JavaScript or CSS beyond fixing a few absolute paths¹. Even the alternate header layout (without the Google+ notification bar) is something that was natively supported by Reader (for the cases where the shared notification code couldn't be loaded). It also only uses publicly-served (compressed/obfuscated) resources that had been sent to millions of users for the past 8 years. As the kids say these days, no copyright intended.

A side effect is that I now have a self-contained Reader installation that I'll be able to refer to years from now, when my son asks me how I spent my mid-20s. It also satisfies my own nostalgia kicks, like knowing what my first read item was. In theory I could also use this approach to build a proxy that exposes Reader's API backed by (say) NewsBlur's, and thus keep using the Reader UI to read current feeds. Beyond the technical issues (e.g. impedance mismatches, since NewsBlur doesn't store read or starred state as tags, or have per-item tags in general) that seems like an overly backwards-facing option. NewsBlur has its own distinguishing features (e.g. training and "focus" mode)², and forcing it into a semi-functional Reader UI would result in something that is worse than either product.

  1. And changing the logo to make it more obvious that this isn't just a stale tab from last week. The font is called Demon Sker.
  2. One of the reasons why I picked NewsBlur is that it has been around long enough to develop its own personality and divergent feature set. I'll be the first to admit that Reader had its faults, and it's nice to see a product that tries to remedy them.

Image-based SVG Masking #

Image-based masking was first introduced by WebKit a few years ago, and has proven to be a useful CSS feature. Unfortunately browsers without a WebKit lineage do not support it, which makes it a less than appealing option for cross-browser development. There is however the alternative of SVG-based masking, introduced by Firefox/Gecko at least partly in response to WebKit's feature. My goal was to find some way to combine the two mechanisms, so that I could use the same image assets in both rendering engines to achieve the masking effect. My other requirement was that the masks had to be provided as raster images (WebKit can use SVG files as image masks, but some shapes are complex enough that representing them as bitmaps is preferable).

SVG supports an <image> element, so at first glance this would just be a matter of something like:

.mask {
  -webkit-mask: url("mask.png");
  mask: url(#svgmask);
}

  <mask id="svgmask">
    <image xlink:href="mask.png" />
  </mask>

Unfortunately, depending on your mask image, when trying that, you will most likely end up with nothing being displayed. A more careful reading shows that WebKit's image masks only use the alpha channel to determine what gets masked, while SVG masks use the luminance. If your mask image has black in the RGB channels, then the luminance is 0, and nothing will show through.

SVG 2 introduces a mask-type="alpha" property that is meant to solve this very problem. Better yet, code to support this feature in Gecko landed 6 months ago. However, the feature is behind a layout.css.masking.enabled about:config flag, so it's not actually useful.

After more exploration of what SVG can and can't do, it occurred to me that I could transform an alpha channel mask into a luminance mask entirely within SVG (I had initially experimented with using <canvas>, but that would have meant that masks would not be ready until scripts had executed). Specifically, SVG Filters can be used to alter SVG images, including masks. The feColorMatrix filter can be used to manipulate color channel data, and thus a simple matrix can be used to copy the alpha channel over to the RGB channels. Putting all that together gives us:

.mask {
  -webkit-mask: url("mask.png");
  mask: url(#svgmask);
}

  <filter id="maskfilter">
    <feColorMatrix in="SourceAlpha"
                   values="0 0 0 1 0
                           0 0 0 1 0
                           0 0 0 1 0
                           0 0 0 1 0" />
  </filter>

  <mask id="svgmask">
    <image xlink:href="mask.png" filter="url(#maskfilter)" />
  </mask>

I've put up a small demo of this in action (a silhouette of the continents is used to mask various textures). It seems to work as expected in Chrome, Safari 6, and Firefox 21.
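
To see why the color matrix in the filter above fixes the problem, it can help to run the numbers on a single pixel. The following is a minimal sketch, not how any browser implements it: it applies a 4x5 matrix the way feColorMatrix is specified to, and computes luminance with the coefficients SVG uses for luminance masks, while ignoring color-space conversion and premultiplication. The function names are made up for the example.

```javascript
// Apply a 4x5 feColorMatrix-style matrix to one RGBA pixel (channels 0..1).
function applyColorMatrix(matrix, [r, g, b, a]) {
  return matrix.map((row) =>
    row[0] * r + row[1] * g + row[2] * b + row[3] * a + row[4]);
}

// Luminance as used for SVG luminance masks (color-space details ignored).
function luminance([r, g, b]) {
  return 0.2125 * r + 0.7154 * g + 0.0721 * b;
}

// Same values as the filter above: every output channel is the input alpha.
const alphaToRgb = [
  [0, 0, 0, 1, 0],
  [0, 0, 0, 1, 0],
  [0, 0, 0, 1, 0],
  [0, 0, 0, 1, 0],
];

const opaqueBlack = [0, 0, 0, 1];       // typical "show this" mask pixel
const transparentBlack = [0, 0, 0, 0];  // typical "hide this" mask pixel

// Unfiltered, an opaque black pixel has no luminance, so it masks everything.
const before = luminance(opaqueBlack);
// Filtered, its alpha has been copied into RGB, so it lets content through.
const after = luminance(applyColorMatrix(alphaToRgb, opaqueBlack));
// Transparent pixels still hide content after filtering.
const hidden = luminance(applyColorMatrix(alphaToRgb, transparentBlack));
console.log({ before, after, hidden });
```

An alpha-only mask image that WebKit handles directly therefore ends up with a matching luminance channel for the SVG code path.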

Source Quicklinks #

It occurred to me that I never blogged about my Source Quicklinks extension (source). I created it back in 2010, when I started working on the Chrome team, focusing on the WebKit side of things. I was spending a lot of time in Code Search trying to understand how things worked. Chromium's code search instance searches not just the Chromium repository itself, but also all its dependencies (WebKit, V8, Skia, etc.). A lot of times the only way to understand a piece of code was to look at the commit that added it (especially in WebKit code, where comments are scarce but the bug associated with the commit often provides the back story). I was therefore doing a lot of URL mangling to go from Code Search results to Trac pages that had "blame" (annotation) views.

Source Quicklinks screenshot

The extension made this process easier, adding a page action that provides back and forth links between Code Search, the Chromium repository (both the ViewVC and the Gitweb sites), WebKit's Trac setup and V8. Later, when the omnibox API became available, it also gained a "sql" search shortcut for the Chromium repository and commits.

Though I no longer work on Chrome, I still find myself going through those code bases quite often. For example, the best way to know whether something triggers layout in Blink/WebKit is by reading the code. I've therefore revved the extension to handle the Blink fork, in addition to cleaning up some other things that had started to bitrot. I also attempted to add cross-links between the WebKit and Blink repositories that take into account the Blink reorganization, though we'll see how useful that ends up being as the codebases diverge more and more.

A REPL for Chrome Apps APIs #

A few months ago, as I was making yet another test app to demonstrate a Chrome packaged app API, I wished for a REPL. In theory the Chrome Dev Tools would fit the bill, since the console lets you run arbitrary JavaScript statements. However, using the dev tools would still involve making a manifest with the right permissions and loading an unpacked app, and at least a background page if not an actual window to inspect. Once you inspected the right page, invoking and inspecting the results of asynchronous APIs would be tedious, with a lot of boilerplate to type every time.

I started to think about creating a purpose-built REPL for Chrome apps APIs. A generic REPL seemed out of the question, since eval is disallowed by the strict Content Security Policy used by apps. My initial thought involved a dropdown listing all functions in the chrome.* namespace and a way to invoke them with canned values (eval may be disallowed, but dynamic invocation of the form chrome[namespace][methodName](arg) is still possible). However, that seemed clunky, and wouldn't help with APIs like the socket one that need to chain several method calls with the parameters for one depending on the results of another.
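
For reference, dynamic invocation of that form needs nothing more than property lookups. Here is a minimal sketch that runs outside of Chrome by substituting a stand-in object for chrome.*; invokeApi and fakeChrome are hypothetical names, and the stand-in method only imitates the shape of a callback-style API.

```javascript
// Stand-in for the real chrome.* object, so the sketch runs anywhere.
const fakeChrome = {
  runtime: {
    getPlatformInfo(callback) { callback({ os: "mac" }); },
  },
};

// Look up an API method by name and call it, without using eval.
function invokeApi(root, namespace, methodName, args) {
  const target = root[namespace];
  if (!target || typeof target[methodName] !== "function") {
    throw new Error("Unknown API: " + namespace + "." + methodName);
  }
  // Calling through target keeps `this` bound to the namespace object.
  return target[methodName](...args);
}

let platformOs = null;
invokeApi(fakeChrome, "runtime", "getPlatformInfo", [
  (info) => { platformOs = info.os; },
]);
console.log(platformOs);
```

This covers single calls with canned arguments, which is exactly where the dropdown idea runs out: chaining calls whose parameters depend on earlier results needs real expressions.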

I then thought more about the eval limitation, and if I could use sandboxed pages to create the REPL environment. In some ways that seemed contradictory; the whole point of sandboxed pages is that they don't have access to Chrome APIs (unlike the main frame/page). In exchange they can use less safe mechanisms such as eval (a form of privilege separation). However, sandboxed pages can communicate with the containing page and get data from it via postMessage¹. In theory the input code could be eval-ed in the sandboxed frame, and when it tried to invoke Chrome APIs, the sandboxed frame would postMessage to the main frame, ask it to run that API method, get the result, and plug it back in the expression that was being evaluated.²

This plan hinged on the fact that nearly all Chrome apps APIs are asynchronous already, thus it should be possible to create seemingly functionally identical proxies in the sandboxed frame. That way, as far as the user is concerned, they're running the original API methods directly. There would need to be some additional bookkeeping to make callback parameters work, but there was no technical barrier anymore.

Before talking about that bookkeeping, since we're now five paragraphs into the blog post, I should cut to the chase and give you a link to the REPL app that I ended up building: App APIs REPL (source). And if you'd like to see it in action, here's a screencast of it showing basic JavaScript expression evaluation and then a more complex example playing around with the socket API to mimic HTTP requests to

Here's how eval-ing the following statement works:

    function(createInfo) {
  1. The main frame (also referred to as the "host" in the source code) gets the input and sends it to the sandboxed frame via an EVAL message. The sandbox dutifully evals it.
  2. chrome.socket.create is a stub that was created in the sandboxed frame: at application startup, the main frame walks over the chrome.* namespace, gathers all properties into a map, and sends them to the sandbox (via an INIT_APIS message). The sandbox re-creates them, generating a stub for each function property and event.
  3. When the stub is invoked, it sends a RUN_API_FUNCTION message to the main frame with the API method (chrome.socket.create in this case) that should be run and its parameters. Most parameters can be copied directly via the structured clone algorithm that is used by postMessage.
  4. However, the second parameter is a function, which cannot be copied. Instead we generate an ID for it, put it in a pending callbacks map, and send the ID in its place.
  5. On the main frame side, the list of parameters is reconstructed. For function parameters, we generate a stub based on the ID that was passed in. Once we have the parameters, we invoke the API function (via dynamic invocation, see above) with them.
  6. When the stub function that was used as the callback parameter is invoked, it takes its arguments (if any), serializes them and then sends them and its function ID back to the sandboxed frame via a RUN_API_FUNCTION_CALLBACK message.
  7. The sandboxed frame looks up the function ID in the callbacks map, deserializes the parameters, and then invokes the function with them.
  8. The callback function uses the log() built-in function. That ends up sending a LOG message to the main frame with the data that it wants logged to the console.
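The callback round-trip in steps 3 through 7 can be sketched as follows. This is a simplified illustration rather than the app's source: the message type names come from the steps above, but createChannel, startHost, startSandbox, the message field names and the fake socket API are all assumptions, and an in-memory channel with a JSON round-trip stands in for postMessage and structured clone.

```javascript
// Hypothetical in-memory channel standing in for postMessage between frames.
function createChannel() {
  const handlers = {};
  return {
    listen(side, handler) { handlers[side] = handler; },
    // The JSON round-trip roughly mimics structured clone: functions cannot cross.
    send(side, message) { handlers[side](JSON.parse(JSON.stringify(message))); }
  };
}

// Main frame: rebuild the parameters, then invoke the real API dynamically.
function startHost(channel, apis) {
  channel.listen('host', (message) => {
    if (message.type !== 'RUN_API_FUNCTION') return;
    const params = message.params.map((param) => {
      if (param.callbackId === undefined) return param.value;
      // Forwarding stub: ships its arguments and the ID back to the sandbox.
      return (...args) => channel.send('sandbox', {
        type: 'RUN_API_FUNCTION_CALLBACK',
        callbackId: param.callbackId,
        args
      });
    });
    const [namespace, methodName] = message.path;
    apis[namespace][methodName](...params);
  });
}

// Sandboxed frame: keep the real callbacks in a map keyed by generated ID.
function startSandbox(channel) {
  const pendingCallbacks = new Map();
  let nextId = 1;
  channel.listen('sandbox', (message) => {
    if (message.type !== 'RUN_API_FUNCTION_CALLBACK') return;
    const callback = pendingCallbacks.get(message.callbackId);
    // Assumption: each callback fires once, so its entry is dropped here.
    pendingCallbacks.delete(message.callbackId);
    if (callback) callback(...message.args);
  });
  return {
    pendingCallbacks,
    makeStub(path) {
      return (...params) => channel.send('host', {
        type: 'RUN_API_FUNCTION',
        path,
        params: params.map((param) => {
          if (typeof param !== 'function') return {value: param};
          const callbackId = nextId++;
          pendingCallbacks.set(callbackId, param);
          return {callbackId};
        })
      });
    }
  };
}

// Usage with a fake API object; the signature and socket ID are made up.
const fakeApis = {socket: {create(type, callback) { callback({socketId: 7}); }}};
const channel = createChannel();
startHost(channel, fakeApis);
const sandbox = startSandbox(channel);
const createSocket = sandbox.makeStub(['socket', 'create']);
const logged = [];
createSocket('tcp', (createInfo) => logged.push(createInfo.socketId));
```

The function itself never crosses the frame boundary; only its ID does, which is what lets everything else go through structured clone.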

Events work in a similar manner, with stubs being generated for add/removeListener() in the sandbox that end up adding/removing listeners in the main frame. There are two maps of listener functions, one in the sandboxed frame from ID to real listener, and one in the main frame from ID to stub/forwarding listener. This allows removing of listeners to work as expected.
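A rough sketch of the two listener maps, under assumptions: createFakeEvent, createHostSide and createSandboxEventStub are hypothetical names, and direct function calls stand in for the messages that would actually pass between the frames.

```javascript
// Minimal stand-in for a chrome.* event object living in the main frame.
function createFakeEvent() {
  const listeners = new Set();
  return {
    addListener(fn) { listeners.add(fn); },
    removeListener(fn) { listeners.delete(fn); },
    dispatch(...args) { for (const fn of [...listeners]) fn(...args); },
    listenerCount() { return listeners.size; }
  };
}

// Main frame: ID -> forwarding listener, so the exact function that was
// registered can be handed back to removeListener later.
function createHostSide(event, forwardToSandbox) {
  const forwarders = new Map();
  return {
    add(id) {
      const forwarder = (...args) => forwardToSandbox(id, args);
      forwarders.set(id, forwarder);
      event.addListener(forwarder);
    },
    remove(id) {
      const forwarder = forwarders.get(id);
      if (!forwarder) return;
      event.removeListener(forwarder);
      forwarders.delete(id);
    }
  };
}

// Sandboxed frame: ID -> real listener supplied by the user's code.
function createSandboxEventStub(host) {
  const listeners = new Map();
  let nextId = 1;
  return {
    addListener(fn) {
      const id = nextId++;
      listeners.set(id, fn);
      host.add(id);
    },
    removeListener(fn) {
      for (const [id, listener] of listeners) {
        if (listener !== fn) continue;
        listeners.delete(id);
        host.remove(id);
        return;
      }
    },
    deliver(id, args) {
      const listener = listeners.get(id);
      if (listener) listener(...args);
    }
  };
}

// Usage: the second dispatch is not seen because the listener was removed.
const fakeEvent = createFakeEvent();
let eventStub = null;
const hostSide = createHostSide(fakeEvent, (id, args) => eventStub.deliver(id, args));
eventStub = createSandboxEventStub(hostSide);
const seen = [];
const onData = (value) => seen.push(value);
eventStub.addListener(onData);
fakeEvent.dispatch('first');
eventStub.removeListener(onData);
fakeEvent.dispatch('second');
```

Without the main frame map, removeListener would have no way to find the forwarding function that was originally registered with the real event.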

The console functionality of the REPL is provided by jqconsole, which proved to be very easy to drop in and hook up input and output to. History of the console is persisted across app restarts via the storage API. Additional built-in commands like help and methods (which dumps a list of all available API methods) are implemented as custom getters in the global JavaScript namespace of the sandboxed frame. There's also a magic _ placeholder that can be used as a callback parameter or event listener; it will be replaced with a generated function that logs invocations.
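The getter trick is what lets a bare word like help act as a command with no parentheses. Here is a minimal sketch, assuming a plain scope object instead of the real global namespace; installCommands and the help text are hypothetical, and modelling _ as a getter is my guess at one possible approach, not necessarily how the app does it.

```javascript
// Install commands on a scope object; reading the property runs the command.
function installCommands(scope, output, apiMethodNames) {
  Object.defineProperty(scope, 'help', {
    configurable: true,
    get() { output('Type a JavaScript expression, or "methods" to list API methods.'); }
  });
  Object.defineProperty(scope, 'methods', {
    configurable: true,
    get() { output(apiMethodNames.join('\n')); }
  });
  Object.defineProperty(scope, '_', {
    configurable: true,
    // Assumption: every read yields a fresh function that logs its arguments.
    get() {
      return (...args) => output('callback invoked with ' + JSON.stringify(args));
    }
  });
}

// Usage: merely evaluating scope.methods produces output.
const outputLines = [];
const replScope = {};
installCommands(replScope, (line) => outputLines.push(line),
    ['chrome.socket.create', 'chrome.socket.connect']);
replScope.methods;
replScope._({socketId: 7});
```

In the REPL the same properties would be defined on the sandboxed frame's global object, so typing methods at the prompt is enough to trigger the getter.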

In addition to being a useful developer and learning tool, I hope that this REPL also helps with thinking with a sandboxed mindset. I know that the Content Security Policy that's used in apps has been controversial, with some taking it better than others. However, I think that privilege separation, declarative permissions, tying capabilities to user gestures/intent and other security features of the Chrome apps runtime are here to stay. CSP is applicable to the web in general, not just apps. Windows 8 requires sandboxing for store apps and its web-based apps are taking an approach similar to CSP to deter XSS. Sandboxing was one of the main themes for Mac desktop developers this year, with Apple finally pulling the trigger on sandbox requirements. Developers of large, complex applications were able to adapt them to the Mac OS X sandbox. That gives me hope that the Chrome app sandbox will not prevent real apps from being created. It is starting with the even more restrictive web platform sandbox and relaxing it slightly, but is generally aiming for the same spot as the Mac one.

I'm also hopeful that there will be improvements that make it even easier to write secure apps. For example, the privilege isolation provided by sandboxed pages was inspired by a USENIX presentation (the paper presupposed no browser modifications, the Chrome team just paved the cowpath).

  1. pkg.js is a library that's cropped up recently for making such main/sandboxed frame communication easier.
  2. Note that this is not the desired pattern for communication between the main and sandboxed frames. Ideally messages that are passed between the two should be as high-level as possible, with application semantics, not low-level Chrome API semantics. For example, if your sandboxed frame does image processing, it shouldn't get to pick the image paths that it reads from or writes to; instead it should be given (and return) a blob of image data. It's up to the main frame to decide where it gets that image data (by reading a path on disk, from the webcam, etc.). Otherwise, if the code in the sandbox is malicious, it could abuse the file I/O capability.