Jason Horman

Created this User CSS to make reddit look more like Hacker News.

Android StrictMode

I didn’t know about this. Going to turn it on in Springpad’s Android build and see what happens.

 StrictMode.setThreadPolicy(new StrictMode.ThreadPolicy.Builder()
         .detectDiskReads()
         .detectDiskWrites()
         .detectNetwork()   // or .detectAll() for all detectable problems
         .penaltyLog()
         .build());

VirtualBox for dev teams

At Springpad the development team has spent a lot of time setting up our local environments. The setup includes:

  • MySQL for Quartz job scheduling and random other lookups
  • Liquibase for creating and updating MySQL schemas
  • Cassandra instances for storing block data
  • SOLR instances for faceting and full text search
  • Zookeeper for atomic counters and distributed locks
  • Memcached
  • Redis (for some new development we are doing)
  • And more…

We documented the ever-changing setup. We wrote shell scripts to automate the process. For Python we switched from shell scripts to Fabric. In the end, though, it still felt cumbersome to get a new environment set up for development.

VirtualBox

We realized just a few months back (we are probably late to the game) that we could set up the entire stack in a VirtualBox instance. Not only that, but we could locally run our services exactly as they run in production, on a Debian distribution. We are now setting developers up with:

  1. VirtualBox running a snapshot of a minimal Debian installation. (This instance runs in about 163MB of memory)
  2. The instance is set to Bridged Adapter network mode so that the machine is directly addressable.
  3. Then we install all of the services listed above, and snapshot again calling this “Initial Springpad”. (This instance runs in about 352MB of memory)
  4. We then can copy this setup to the various development machines.
  5. From there the developer can start the “Initial Springpad” instance and connect to it directly from their local OSX development environment.

Nice Benefits

  1. Fast start/stop: To conserve memory, or at the end of the day, you can simply tell VirtualBox to “Save machine state”. VirtualBox will pause the VM and shut down. This takes about 5 seconds. When you want to restart development, VirtualBox will unpause the VM, also in about 5 seconds. Contrast that with starting and stopping the 10 or so services we had running on OSX before.
  2. Snapshots: Whenever you feel like it you can take additional snapshots. Before testing that bulk delete operation from Cassandra, take a snapshot. Test, throw the snapshot away and revert to the original instance of Cassandra. Very powerful.
  3. Clean Mac: Using VirtualBox moved all of the configuration, log, and installation files of all of these services out of our personal Mac OSX /usr/local, /var/log, /etc directories. They are now neatly tucked away in the VirtualBox instance.
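
The save/resume and snapshot steps can also be scripted through the VBoxManage command line tool instead of the GUI. Here is a minimal sketch, assuming the VM is registered as “Initial Springpad”. The helper function names and the snapshot name in the usage comments are made up for illustration. Each helper only builds the command; run() executes it.

```python
import subprocess

VBOXMANAGE = "VBoxManage"  # assumes the VirtualBox CLI is on the PATH


def save_state_cmd(vm):
    """Command for "Save machine state": pause the VM and write its state to disk."""
    return [VBOXMANAGE, "controlvm", vm, "savestate"]


def start_cmd(vm):
    """Command to start (or resume a saved) VM without opening a GUI window."""
    return [VBOXMANAGE, "startvm", vm, "--type", "headless"]


def take_snapshot_cmd(vm, name):
    """Command to take a named snapshot of the VM."""
    return [VBOXMANAGE, "snapshot", vm, "take", name]


def restore_snapshot_cmd(vm, name):
    """Command to revert to a named snapshot (the VM must not be running)."""
    return [VBOXMANAGE, "snapshot", vm, "restore", name]


def run(cmd):
    """Execute a command built by the helpers above; raises on a non-zero exit."""
    subprocess.run(cmd, check=True)


# Hypothetical usage around a risky test:
#   run(take_snapshot_cmd("Initial Springpad", "before-bulk-delete"))
#   ... run the bulk delete test ...
#   run(save_state_cmd("Initial Springpad"))
#   run(restore_snapshot_cmd("Initial Springpad", "before-bulk-delete"))
#   run(start_cmd("Initial Springpad"))
```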

So I definitely would recommend checking out VirtualBox if you want to simplify your local development environment.

Picture from the back.

Really liking the new monitor arm for the LCD. It swivels, extends/retracts, rotates, and has a really nice look.

Pycharm OSX install graphic

Servers are only as good as their clients

When we started working with Cassandra we experimented with different Java and Python clients. While they all supported the required APIs, and they each added their own sugar on top, not all of them had thought through how to architect a client in a way that matched the power of Cassandra itself. A backend service, whether it is SOLR, memcached, Redis, or Cassandra, is only as good as the client libraries available.

Connection management

  • Connection pooling.
  • +1 for “smart” pooling that picks the least loaded backend.
  • Auto reconnect
  • Auto removal of bad connections from the pool. Auto recovery on server recovery.
  • Connect/read timeouts.
  • +1 for auto discovery of available hosts. (Zookeeper, Cassandra)
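
To make a couple of these points concrete, here is a rough sketch of just the host selection part of a pool: pick the least loaded live host, sideline a host after a failure, and let it back in after a retry window. It is not taken from any of the clients mentioned in this post. The class name, method names, and the retry_after parameter are all hypothetical, and real sockets, timeouts, and host discovery are left out.

```python
import time


class HostPool:
    """Tracks hosts, picks the least loaded live one, and sidelines failing hosts."""

    def __init__(self, hosts, retry_after=30.0, clock=time.monotonic):
        self._in_flight = {host: 0 for host in hosts}  # outstanding requests per host
        self._down_until = {}                          # host -> time it may be retried
        self._retry_after = retry_after
        self._clock = clock

    def _live_hosts(self):
        now = self._clock()
        return [h for h in self._in_flight
                if self._down_until.get(h, float("-inf")) <= now]

    def acquire(self):
        """Return the live host with the fewest in-flight requests."""
        live = self._live_hosts()
        if not live:
            raise RuntimeError("no live hosts available")
        host = min(live, key=lambda h: self._in_flight[h])
        self._in_flight[host] += 1
        return host

    def release(self, host, failed=False):
        """Hand a host back; a failure removes it from rotation for a while."""
        self._in_flight[host] -= 1
        if failed:
            self._down_until[host] = self._clock() + self._retry_after

    def mark_down(self, host, seconds=float("inf")):
        """Explicit override: take a host out of rotation regardless of its health."""
        self._down_until[host] = self._clock() + seconds
```

The injectable clock is there only so the recovery behaviour can be exercised without sleeping.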

Logging/Operations

  • Configurable logging for monitoring.
  • Some way of getting performance metrics, error rates, current configuration.
  • In Java, JMX config, stats, cluster management.
  • +1 for a simple REST/JSON based management API for stats, config, cluster management, etc.
  • API for explicit overrides. Ops generally wants the ability to say sorry $CLIENT_LIB, I don’t trust you, and I know that node 11 is down right now.

It is actually pretty hard to get all of this right. Why start from scratch, though? Take a look at Hector or Telephus for Cassandra, or java-memcached-client.

Twisted timeouts

Here is a useful snippet we use when working with Twisted. It is nice to be able to just decorate a method with a timeout.
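
As a rough illustration of the idea, here is a minimal sketch of a timeout decorator. Note the swap: it is written against asyncio from the Python standard library rather than Twisted, so it wraps coroutines instead of functions that return Deferreds. The decorator name and the two example functions are hypothetical.

```python
import asyncio
import functools


def timeout(seconds):
    """Decorator: cancel the wrapped coroutine and raise asyncio.TimeoutError
    if it has not finished within `seconds`."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            return await asyncio.wait_for(func(*args, **kwargs), timeout=seconds)
        return wrapper
    return decorator


@timeout(0.05)
async def slow_lookup():
    # Stand-in for a backend call that takes far too long.
    await asyncio.sleep(1)
    return "done"


@timeout(1.0)
async def fast_lookup():
    # Stand-in for a backend call that returns promptly.
    await asyncio.sleep(0)
    return "done"
```

Calling asyncio.run(fast_lookup()) returns normally, while asyncio.run(slow_lookup()) raises asyncio.TimeoutError after roughly 50 milliseconds.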

Async IO

Some of the backend systems that Springpad uses are being written in Python and use Twisted. Twisted, Tornado, EventMachine, and Netty are frameworks that ease development of services that want to take advantage of non-blocking IO. Instead of scaling via more IO-bound threads or processes, these frameworks use various techniques to deliver asynchronous notifications when IO is available.

For Springpad, this makes a ton of sense. We connect to a large number of 3rd party services. Amazon, Yelp, Google, Pricegrabber, Netflix, and more. We can’t control the response time of these services, and dedicating threads to just wait for responses can be expensive.

When I read online about these frameworks, I generally find information about how they can scale to thousands of connections, how they can be used to implement real-time services, how one is better than another because it can handle yet another couple of hundred connections, and how threads can actually outperform async IO under certain conditions. What I don’t generally see people talking about:

Async IO can remove the need for threads

In many cases async IO can remove the need for threads in your application. Thread programming can be difficult. Many libraries are not thread safe (Java SimpleDateFormat, many Python libs), so you have to be very defensive. Some people are very good at it, but there is plenty of evidence to suggest that getting multithreaded programming right is hard. I personally have been enjoying switching my mindset to single threaded. That said, async IO is still concurrent. You still have to be careful when dealing with shared state.

Many web 2.0 services are IO bound

Many services spend a great deal of time talking to APIs outside of their control. Threads can be expensive, and hard to work with, when all you are trying to do is fetch some data from a web service. Internal queries to databases, nosql stores, memcached, all generally end up being IO bound. Springpad has code that fetches >100 items at a time from memcached and/or Cassandra.
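
To illustrate the kind of bulk fetch described above without any threads, here is a minimal sketch using asyncio from the Python standard library (not Twisted, and not Springpad's actual code). The backend call is faked with a sleep, and the function names, key format, and 50 ms latency are assumptions made up for the example.

```python
import asyncio
import time


async def fetch_item(key, delay=0.05):
    """Stand-in for one memcached/Cassandra/web lookup; the sleep is the IO wait."""
    await asyncio.sleep(delay)
    return key, "value-for-" + key


async def fetch_many(keys):
    """Issue all lookups at once on a single thread and collect the results."""
    pairs = await asyncio.gather(*(fetch_item(k) for k in keys))
    return dict(pairs)


def timed_fetch(count=100):
    """Fetch `count` fake items and report how long the whole batch took."""
    keys = ["item-%d" % i for i in range(count)]
    start = time.perf_counter()
    items = asyncio.run(fetch_many(keys))
    return items, time.perf_counter() - start
```

Because the waits overlap, the batch takes roughly as long as one lookup rather than 100 of them back to back, and no thread sits blocked on any single response.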

The Debate

So sometimes I think that the debates about the need for, or the performance of, async IO frameworks are missing a key part of the positive argument. There is inherent value in them in that they can lead to cleaner code with fewer bugs, while at the same time probably increasing scalability quite a bit.

Tools and tech at springpadit.com

  • Yammer - We use Yammer to communicate internally. This includes project updates, build status, basketball bracket updates, and general shit giving.
  • Java - Much of the original Springpad backend is built in Java. Some of the front end is as well (GWT).
  • Python - Some of the newer services are being built in Python, specifically on top of Twisted and Cyclone.
  • IntelliJ/PyCharm - We are big fans of JetBrains.
  • JIRA - We/I have looked for alternatives many times. You just can’t beat the flexibility of JIRA though.
  • Confluence - Again, a very flexible tool. In addition to holding documentation, we have automated processes that add reporting, build information, and user feedback to Confluence.
  • SOLR - Our full text index, though we are investigating switching to Elasticsearch.
  • Cassandra - The main data store Springpad uses. Replicated, self healing, scalable.
  • GTalk - For rapid un-yammerable things.