Playdeo was a mobile games studio set up by Jack Schulze, Timo Arnall and myself in 2016 to explore the possibilities of full screen video married with a 3D game engine.
This union was very novel, with almost no precedent for the kinds of interaction opportunities it offered. In situations like these, the work ahead requires a lot of material exploration, and thinking through making. Iteration is key, and in this article I’ll show some examples of the various prototypes we built, a rough timeline, and the key milestones we reached which finally unlocked our first fully shipped game, Avo.
In writing this piece, I’ve found that many aspects of this work only became apparent to me in hindsight. Material exploration can be a focused and somewhat myopic state to work in. You’ve not yet got any perspective, as there’s no whole form from which to stand back and assess. You set out to create user-centric and experiential milestones, and assemble a rough technical infrastructure to support them. You want to get to an end-to-end build by taking as many shortcuts as possible.
As humans, we love thinking about these cumulative periods of iterative work in terms of Eureka! moments, or overnight successes. I was guilty of exactly this, with one key piece of work standing out in my memory as THE moment it all came together: the line-drawing interaction for moving your character that Jon Topf came up with.
The truth is that there were numerous key moments spread over years, each usually providing a solid layer on which other work could sit. Sometimes these were obvious, sometimes more subtle, but each layer created new possibilities. In great teams, everyone contributes to this gradual layering process.
From your own perspective, there’s also a natural tendency to discount your own work as less important than others’, but that’s your own bias talking. What you’re working on is obviously never going to come as a pleasant surprise to you. It’s also important to be humble and let work speak for itself. From an outsider’s perspective there will be many contributions from all corners that unlock a complex finished product, each important in its own right. Letting in a culture of rock-star contributions is neither accurate nor healthy for long-term teamwork.
So here we’ll see how we worked our way from a smoke-and-mirrors demo on a laptop through to a shipped iOS App Store game with 4 million downloads.
The Spark #
Back in 2015, Jack had been thinking about the possibilities of touchable video when he saw his young daughters reaching out and touching the screen of their iPad while watching a movie. Everything else you do with an iPad involves screen interaction, and in their minds video shouldn’t be any different. This made Jack wonder how true touch interaction with the video picture might actually work, and he set to work on a prototype. This was before my direct involvement; I think he worked with Greg Borenstein to get things off the ground, having previously worked successfully with Greg at BERG.
Some time later, Jack invited me for a coffee and showed me his first working prototype, running on a laptop, that integrated 3 key elements: the video, a 3D scene and camera motion tracking. The prototype barely held together. There was no audio, the file sizes were huge, playback wasn’t smooth, rendering was inefficient; the list went on. However, in spite of all that, there was something undeniably magical about it. You could render 3D objects into the video with rock-solid believability, and you could interact contextually with anything on the screen. It was more believable than the AR of the time, and, unlike anything pre-rendered, it was under the player’s control.
The core elements were present, but to truly unlock a viable future for the tech, we knew we had to get a prototype working on mobile devices. This would immediately derisk the project by a huge amount, and since I’d spent time developing native iOS apps before, Jack wanted to know if this was something I’d be interested in working on. There was clearly enough here to show the enormous potential, so I eagerly hopped on board.
The first thing I worked on was video playback. There’s no way you could ship a product with JPEG flip-books, so we’d need a way of decoding video frames in realtime. After about 3 weeks of work in October 2015, I had a very crude iOS prototype working where we could decode true MP4 video frames, and pass that into Unity as a standard texture. Crucially, this also included discrete frame numbers, so we could look up the accompanying camera metadata when rendering the video frames. This gave us the confidence to start scaling up our work, and start working with someone who really knew the Unity engine. While I’d picked up just enough to get this small breakthrough working, Unity is a vast system, and we needed someone who was comfortable sketching and developing with it.
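The contract between the decoder and the engine can be illustrated with a small sketch. All names here are hypothetical (the real code was a native iOS plugin feeding Unity), but the idea is the same: every decoded frame carries a discrete frame number, which keys into the camera track recovered by the motion-tracking pass.

```python
# Sketch of the frame-number contract between the video decoder and the
# engine: each decoded frame carries a discrete index, which looks up
# the matching virtual camera pose. Illustrative names, not a real API.

from dataclasses import dataclass

@dataclass
class CameraPose:
    position: tuple  # world-space (x, y, z)
    rotation: tuple  # euler angles, degrees

# Per-frame camera metadata exported by the tracking pipeline.
camera_track = {
    0: CameraPose((0.0, 1.2, -3.0), (10.0, 0.0, 0.0)),
    1: CameraPose((0.0, 1.2, -2.9), (10.1, 0.0, 0.0)),
}

def pose_for_frame(frame_number: int) -> CameraPose:
    """Look up the virtual camera pose matching a decoded video frame."""
    return camera_track[frame_number]

# When frame N's pixels hit the screen, the 3D camera is set to
# pose_for_frame(N), so rendered objects stay locked to the video.
```

Keeping the lookup keyed on discrete frame numbers, rather than timestamps, is what makes the registration rock solid: there is never any ambiguity about which pose belongs to which picture.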
Throughout late 2015 and early 2016, Aubrey Hesselgren worked with us on the very first prototypes. These were crude, and relied on very ad-hoc processes of ingest and data manipulation, but we began to get a feel for the challenges ahead. Builds of our test code would only work on actual phones rather than inside the Unity editor. Building and deploying was slow, and overall iteration time was painful. We’d frequently hit issues like playback synchronisation problems, which involved a lot of trial-and-error debugging. To diagnose timing issues, we burned timecode into the video so we could check that the frame actually being displayed matched the number the plugin had given to Unity. Because this system had to run at a faultless 60 frames per second, we often used the ultra slow-motion video recording facility in iOS to check our synchronisation. Looking back now, I’m struck by how crude our debugging systems were, but back then we were at the cutting edge, and anything we needed we had to make ourselves. No part of the existing Unity ecosystem was geared up to help us with our unique problems.
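The timecode burn-in itself is the sort of thing ffmpeg’s drawtext filter handles well. A sketch of how such a command might be assembled (file names are illustrative, and exact filter options vary between ffmpeg builds):

```python
# Sketch of a timecode burn-in for sync debugging: ffmpeg's drawtext
# filter stamps a running timecode onto every frame, so a slow-motion
# recording of the phone screen can be compared against the frame
# number the plugin reported to Unity. Treat details as illustrative.

def timecode_burn_cmd(src: str, dst: str, fps: int = 60) -> list:
    drawtext = (
        f"drawtext=timecode='00\\:00\\:00\\:00':rate={fps}"
        ":fontsize=48:fontcolor=white:x=20:y=20"
    )
    return ["ffmpeg", "-i", src, "-vf", drawtext, dst]

cmd = timecode_burn_cmd("take_01.mp4", "take_01_tc.mp4")
```

The burned-in numbers survive any amount of re-encoding and screen recording, which is exactly why they are more trustworthy than any in-engine debug overlay.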
As time went on, I began to bring formality and automation to our data pipeline and build process. The ingest process consisted of an ever-growing Python script that formed the spine of multi-stage data wrangling. Because we wanted near-instantaneous access to any video frame, we used ffmpeg to concatenate all the videos together into a single seekable file. We used Autodesk’s FBX Python library to programmatically get keyframe data from the camera track, rather than relying on Unity’s systems, which always wanted to smooth this motion out. The script also started to enforce naming conventions to tie all the disparate elements together automatically, and it attempted to identify and reject human error in the upstream processes, preventing bad data from entering builds. This would help reduce overall debugging time, even if it looked overly fussy from the video post-production team’s point of view.
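The key property of the single concatenated file is that every clip gets a global frame offset, so any frame in the game is addressable as one integer, and a global frame maps straight back to a clip. A minimal sketch of that indexing idea (clip names and lengths are made up):

```python
# Sketch of global frame addressing over a concatenated video file:
# each clip occupies a contiguous frame range, so seeking to any clip
# is just seeking to its offset. Names and numbers are illustrative.

from bisect import bisect_right

clips = [("intro", 300), ("kitchen_wide", 450), ("table_close", 600)]  # (name, frames)

# Cumulative start offset of each clip within the concatenated file.
offsets = []
total = 0
for _, length in clips:
    offsets.append(total)
    total += length

def locate(global_frame: int) -> tuple:
    """Map a global frame number back to (clip name, local frame)."""
    i = bisect_right(offsets, global_frame) - 1
    name, _ = clips[i]
    return name, global_frame - offsets[i]
```

With this table in hand, “cut to camera B” becomes “seek to offset of camera B’s clip”, with no per-clip file opening or decoder setup in the hot path.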
For the build process we used Fastlane. I disliked Unity’s automatically generated Xcode projects, as they would frequently be out of step with iOS releases, and I wanted a way to manipulate the generated project files independently. I’d seen Fastlane put to excellent use by Tom Taylor in a previous job, and knew it represented the perfect Swiss Army Knife for manipulating, building and distributing our prototypes.
As we got more comfortable working with this new medium, we focused on producing the Orange Car Demo, shown above. We felt this demonstrated the amazing potential of what we could make in an easily digestible way, and more importantly it ran directly on your phone. On the strength of this demo, we sought some initial investment to start scaling Playdeo up to a full company. Games industry mogul Chris Lee joined us as a 4th founder, unlocking the inner mysteries of the games industry, as Jack, Timo and I had not previously worked in this medium. We settled into a co-working space in Whitechapel, East London. Having spent much of the previous time camped out in Timo’s mother’s front room, 3 or 4 of us packed in like sardines, it was a welcome step. It was noisy and hectic, and inexplicably the tele-sales entrepreneurs would always love having their loudest conversations just outside our door. On the plus side we had one of the most important pieces of equipment, a huge whiteboard. I really think it’s the intellectual and spiritual hearth for people doing collaborative, inventive work.
By September 2016 we were working with Yera Diaz to make a Unity editor plugin for video playback. We would finally be able to prototype by simply hitting the play button on our laptops rather than requiring a whole iOS build to be made before we could see anything working. This would transform the experience of exploring this new medium, and accelerate our progress towards our first game.
In October 2016, Jonny Hopper and Mike Green from Glowmade briefly joined us to smarten up some of the core code, and to start thinking about gameplay and interaction. We experimented with a platform game, and at that stage we were still very much treating the phone like a TV: landscape orientation, with virtual joysticks for control. We were starting to mould the codebase into something where we could truly experiment, and to build a quick, predictable pipeline.
Quickly after, we started working on a number of prototypes: Day/Night, Time Travelling, Kerbside, Physics Toy, Clean Up My Mess. All of these helped bring forward the idea that you were playing in video, not just watching it. While each one was its own separate prototype concept, all of them explored touch interactions in various ways. Should the phone be horizontal or vertical? Was a virtual joystick really the best way for players to interact? How do you receive feedback?
For some of our experiments, we needed a way of transitioning from a 2D touch to a 3D drag, as can be seen above. Although Avo never shipped with any draggable UI elements, we did need to blend our 2D and 3D touch thinking for the line drawing mechanic.
New offices, new people #
At the start of 2017 we moved to The Trampery Republic, a workspace in East India Dock in East London, and started to scale up our headcount. This was exciting but also put pressure on everybody, as our tools were still very much at an early stage. If you couldn’t program in Unity yourself, it was tough to achieve anything technically. We lacked sophisticated editor tools, so all design work had to happen through sheer imagination first and foremost, which is particularly tough when working with a new medium that lacked a back catalogue of reference material.
In April 2017 we started working with Super Spline Studios and Shanaz Byrne on a project we called Night Garden, featuring a character called Tolla, seen above. It was our first time experimenting with humanoid animations, inverse kinematics, enemies and a whole slew of other features. It was an on-rails runner, with limited control over where Tolla was positioned, and a single long-take video of Timo’s mum’s garden. Ultimately we didn’t take it forward, as we felt there wasn’t sufficiently diverse gameplay or replayability, but we were slowly improving our capabilities and ambitions. It’s at this point that we fully committed to the vertical orientation as our preferred way of holding the phone, and to using a single finger for most interactions. It was the right balance between interaction, comfort and screen visibility. Bear in mind this was before TikTok, YouTube Stories or any other large-scale proof that vertical video would be accepted by our audience.
In August 2017 we briefly went back to the idea of controlling small cars on a race track, similar to the original orange car demo, but now shot vertically with the single-finger interaction. This featured a small but crucial new facility for us: cutting to a different camera as you approached the edge of the screen. Although this brief exploration of racing wasn’t taken forward, the idea of cutting between cameras would stay, allowing the player to explore the 3D space under their own control. This threw up all kinds of interesting questions about continuity and video time vs game time. The player experiences each use of a video clip in a strictly linear fashion, so we had to be careful with clips mutating global state. If we used a clip in which someone places down a cup of coffee, then every subsequent clip we used had to show the coffee cup on the table. Clips designed for reuse had to be as neutral as possible.
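One way to think about this continuity constraint is to tag each clip with the world state it depicts, and only let the game cut to clips that agree with what the player has already seen. This is an illustrative sketch, not Avo’s real data model:

```python
# Sketch of continuity filtering: each clip declares the prop states it
# shows, and a clip is only usable when everything it shows agrees with
# the current world state. A clip that shows nothing is fully neutral.
# Entirely illustrative names and data.

clips = [
    {"name": "wide_a",  "shows": {"coffee_cup": False}},
    {"name": "wide_b",  "shows": {"coffee_cup": True}},
    {"name": "closeup", "shows": {}},  # neutral: usable in any state
]

def usable(clip: dict, world_state: dict) -> bool:
    """A clip is usable if every prop it shows matches the world state."""
    return all(world_state.get(prop) == value
               for prop, value in clip["shows"].items())

state = {"coffee_cup": True}  # the coffee has been placed down on camera
choices = [c["name"] for c in clips if usable(c, state)]
```

The practical consequence falls out of the model: the fewer stateful props a clip shows, the more situations it can be reused in, which is exactly why reusable clips had to stay neutral.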
The next stage of prototyping was unlocked by Jonathan Topf. Because of his work on Trickshot, he had a good feel for players using a touch screen as the primary control system. His insight was to let players draw an intended path of movement for the character, rather than manipulate indirect controls like a touchpad or virtual joystick. I remember being really impressed at the time, convinced that this method of input was right for our game, and I said as much on the commit in our Slack.
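The geometry behind line drawing is worth a quick sketch: each 2D touch becomes a ray from the tracked camera, intersected with the table plane to produce a 3D waypoint. In Unity this would be a screen-point ray plus a plane raycast; the version below is a pure-Python illustration of that intersection:

```python
# Sketch of turning a 2D touch into a 3D waypoint: intersect the
# camera ray for that touch with the horizontal table plane.
# Illustrative only; the real game did this inside Unity.

def ray_plane_hit(origin, direction, plane_y=0.0):
    """Intersect a ray with the plane y = plane_y.
    Returns the 3D hit point, or None if the ray misses the plane."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    if abs(dy) < 1e-9:
        return None  # ray is parallel to the plane
    t = (plane_y - oy) / dy
    if t < 0:
        return None  # plane is behind the camera
    return (ox + t * dx, oy + t * dy, oz + t * dz)

# Camera one unit above the table, looking straight down:
hit = ray_plane_hit((0.0, 1.0, 0.0), (0.0, -1.0, 0.0))
```

Running every sample of the drawn stroke through this intersection yields a polyline on the table surface, which is the path the character then follows.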
We shot a demo called Sharpen which featured a player-controlled inanimate object, not yet an avocado, but a pencil sharpener. During the intro, our human in the scene would drop a magic droplet of liquid onto a regular pencil sharpener, eyes would suddenly spring out, and it would be brought to life, much to the delight of the human.
In this demo, we had a basic task for the player: finding items strewn around the play area, giving us our very first taste of player tasks in the form of fetch quests. It also let us get a better feel for drawing lines to navigate and for cutting between different cameras, each of which needed to carefully frame the playspace. We also explored the idea of cinematic cameras, which looked good but didn’t invite immediate interaction, as they might be in motion, or might not frame the table surface in a way that made line drawing easy.
We also shot it with a very shallow depth of field, which taught us an important lesson: the more extreme the DoF effect, the less stable the motion track we got. If we were going to make a lot of these cameras work correctly, we needed to be much more modest with the depth and position of the focal plane.
Sharpen was then followed by a more fleshed-out idea named Tiny Frankenstein. It kept the idea of inanimate objects brought to life, but incorporated Jon’s new procedural walking system to give the characters much more life. Alpha 1 was shot with Jack as the mad inventor on a makeshift set designed to test blocking, but it was becoming increasingly obvious that we needed to start looking for an actor who could help us tell the story properly, and a space which could be larger and more lavishly decorated. Our pencil sharpener hero had now become an avocado.
Alpha 2 featured a custom-built set, Katie Reece as our hero inventor Billie, an early version of Avo with no arms, props with special effects, and a much stronger narrative with lines of dialogue. The tone was influenced by Wallace and Gromit, and by a desire to keep things simple, so our protagonist would be silent. Much of the editing would be done outside the game engine, to minimise the amount of unused video in the application while still allowing us to tweak the timing of sequences dynamically. This meant keeping handles on the clips: a post-production term for extra footage at the front and back of each clip, with the expectation that you’d start playing 1 or 2 seconds from the beginning, and stop playing 1 or 2 seconds from the end.
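In frame terms, handles are a simple calculation: the default in and out points sit inside the delivered footage by the handle duration, leaving room to slide either edit point later. A small sketch with illustrative numbers:

```python
# Sketch of "handles": each clip is delivered with spare footage at the
# head and tail, and the default in/out points sit inside that spare
# footage, so sequence timing can still be tweaked after delivery.
# Illustrative numbers, assuming a 60 fps pipeline.

FPS = 60

def playback_range(total_frames: int, handle_seconds: float = 1.5):
    """Default in/out frame numbers for a clip delivered with handles."""
    handle = int(handle_seconds * FPS)
    return handle, total_frames - handle

in_f, out_f = playback_range(600)  # a 10-second delivery at 60 fps
```

Because the handles travel with every clip, a timing tweak in-game never requires going back to the edit suite; the footage is already there on either side of the cut.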
Alpha 2 got more and more polish, and it began to generate its own gravity. Exciting, fun, full of heart, and finally something we could scale up into a full game. Ryan North and Gemma Arrowsmith were brought in to help us create a fun story. It ended up being wildly ambitious and needed scaling back, but the bones of it were there. What followed was location scouting, set building, bespoke prop creation, full script development, table reads and all of the usual aspects of a full TV production, except done on a small budget, and very much in the guerrilla film-making school.
It’s at this point that we finished the raw invention phase and tipped into production, polish and delivery. I cover a lot more of this in the technical post, but filming started in May 2018 and continued for approximately 10 weeks. I was split between supporting our data pipeline and writing systems in Unity. We used Blackmagic Design’s Resolve, which at the time corrupted timelines frequently, and I had to hook up a snapshot system for Postgres to let us roll back efficiently when this happened. I also had to scale up the Python-based processing script to cope with a vastly larger number of clips running through it. We were now under pressure to deliver a large volume of work quickly, so I needed to put in far more safeguards and cross-checks to prevent human error from slipping by unnoticed.
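Since Resolve keeps its project data in Postgres, a snapshot system can be as simple as scheduled pg_dump archives plus pg_restore for rollback. A hedged sketch of how such commands might be assembled (database and file names are illustrative, not our actual setup):

```python
# Sketch of a Postgres snapshot/rollback safety net for Resolve's
# project database: periodic pg_dump archives give you a known-good
# point to restore when a timeline corrupts. Illustrative names only.

import time

def snapshot_cmd(db: str = "resolve_projects") -> list:
    """Build a pg_dump command producing a timestamped custom-format dump."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    return ["pg_dump", "--format=custom", f"--file={db}-{stamp}.dump", db]

def restore_cmd(dump_file: str, db: str = "resolve_projects") -> list:
    # --clean drops existing objects before restoring the snapshot.
    return ["pg_restore", "--clean", f"--dbname={db}", dump_file]

cmd = restore_cmd("resolve_projects-20180601-120000.dump")
```

The custom dump format keeps each snapshot compact and restorable as a unit, which is what makes rolling back a corrupted timeline a minutes-long operation rather than a lost day of editing.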
From about September 2018 to January 2019 we implemented the 8 episodes seen in the main game. We added subtitle support, full music and sound effect support via Audiokinetic’s Wwise, Bluetooth audio support, the save checkpoint system, localisation, analytics, general UI, IAP integration, On-Demand Resource support, AR mode, low and high resolution videos, and a whole host of other things. We had no specific producer, so we took turns to run our weekly planning meetings, which were crucial for establishing bottlenecks, and towards the end I was generally responsible for keeping the flow of work steady as it became more and more technical. It was a remarkably intense time, and for the most part highly productive.
We finally launched Avo at the end of January 2019, and it has gone on to have nearly 4 million downloads, and is regularly promoted in the App Store to this day. For our first title I consider it a huge success. While it may seem coherent and polished from the outside, it really was a hard won product from 3 years of inventive exploration in a brand new medium.