Friday, February 3, 2012

Speaking at RubyNation and Moderating Are There Angels Among Us?

I'll be giving a talk at RubyNation in Reston, VA on March 23rd or 24th, tentatively titled "Coding for Uncertainty: How to Design Maintainable, Flexible Code for a Startup Application". I plan to discuss lessons learned from building OtherInbox and subsequent projects, and how I try to hit my maximum sustainable development speed.

I will also be moderating a Think Big Baltimore event called Are There Angels Among Us? on 2/16 here in Baltimore, all about angel investment in the mid-Atlantic region. I have a couple of free tickets to share if anyone reading this blog would like to check it out. Email me if interested.

Hope to see you there!

Thursday, February 2, 2012

The one best way I know of to write software tests

Early in 2011 I had a prophetic conversation with fellow Baltimore hacker Nick Gauthier that radically changed the way I think about testing web applications. He described a methodology where you almost exclusively write high-level acceptance or integration tests that exercise several parts of your code at once (vs. the approach I had previously used, writing unit tests for every component backed up by a few integration tests). For a Ruby app this means using something like Capybara (depicted below), Cucumber, or Selenium to test the entire stack, the way a user would interact with your site.

These tests aren't meant to be exhaustive - you don't test every possible corner case of every codepath. Instead you use them to design and verify the overall behavior of the system. For example, you might write a test to make sure your system can cope with invalid user input:

describe "Adding cues" do
  let(:user) { Factory(:user) }
  before { login_as_user(user) }

  it "handles invalid data" do
    visit "/cues/new"
    select "Generic", from: "Type"
    # Submitting the form without the required fields must not create a record
    expect { click_button "Create Cue" }.not_to change(Cue, :count)
    should_load "/cues"
  end
end

Usually with this technique you would not write a separate test for each type of invalid data, since tests like these are fairly expensive. Instead, you combine the test above with a series of unit tests that examine the components involved in that behavior in isolation. Typically these tests run much more quickly because they don't involve the overhead of setting up a request, hitting the database, etc.

In the above example we could cover all of the invalid cases with a model unit test that looks like this:

describe Cue do
  it { should validate_presence_of :name }
  it { should validate_presence_of :zip_code }
end
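
For readers without a Rails app handy, here is a minimal plain-Ruby sketch of the kind of check these unit-level tests exercise. The `Cue` class below is a hypothetical stand-in, not the real ActiveRecord model, and `REQUIRED_FIELDS` and `missing_fields` are names invented for illustration; only the `name` and `zip_code` attributes come from the spec above.

```ruby
# Hypothetical stand-in for the Cue model. The real model would be an
# ActiveRecord class with presence validations; this is only a sketch.
class Cue
  REQUIRED_FIELDS = %i[name zip_code].freeze

  attr_reader :name, :zip_code

  def initialize(name: nil, zip_code: nil)
    @name = name
    @zip_code = zip_code
  end

  # Returns the required fields that are nil or blank.
  def missing_fields
    REQUIRED_FIELDS.select { |field| public_send(field).to_s.strip.empty? }
  end

  def valid?
    missing_fields.empty?
  end
end

# Each edge case is a single cheap assertion: no request, no database.
complete = Cue.new(name: "Rain alert", zip_code: "21201")
blank    = Cue.new(name: " ", zip_code: "21201")
puts complete.valid?
puts blank.missing_fields.inspect
```

The point of the split is that every new flavor of invalid input becomes one more line at this level, while the single integration test above keeps confirming that the form, controller, and model are wired together.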

What you end up with is a small number of integration tests which thoroughly exercise the behavior of your code combined with a small number of extremely granular tests that run quickly and cover the edge cases.

One Criticism of This Approach

This idea has been working wonderfully for me. I feel like it gives me excellent code coverage without creating a massively long-running test suite. But I did notice Nick Evans critiquing this style of testing a while ago on Twitter:
lots of integration tests + very few unit tests => a system that mostly works, but probably has ill-defined domain models and internal APIs.
The fact that it got retweeted and favorited a number of times makes me think he's onto something, though I haven't run into this problem yet, and I'm rigorous about keeping domain models and APIs clean. I have no problems refactoring in order to keep my average pace of development high. In my experience adhering to a strict behavior-driven development approach has kept me from running into the problem he describes, but that might not hold if I were part of a team. Time will tell.

Tuesday, January 31, 2012

My twelve-factor app development environment (Unicorn and Pound)

I've been hugely influenced by the twelve-factor app manifesto written by Heroku. I really like building apps on Heroku and think it's pretty great for prototyping things, because it totally abstracts away the need to think about devops in the early stages of a project. I don't think Heroku is a panacea (I don't have experience running something large on it), but it's great for getting something going quickly.

That manifesto illustrates how you end up building your app if you launch it on Heroku. I've found the principles greatly increase my productivity no matter what platform I'm using. One thing that did take me a little while to figure out: most of my apps are SSL-only, and I always want to interact with them over SSL in my development environment because odd things can creep up in production if you never test SSL locally. (Most of the advice Rails blogs give points you towards shutting off SSL in development mode, which seems crazy to me.)

At first I couldn't quite figure out how to make SSL work while complying with factor 7, Port Binding. I was still using Phusion Passenger with SSL configured as I described in this very old article (which still gets a lot of traffic).

Now here's what I do. I run the Pound reverse proxy using this configuration:

# http://lvh.me
ListenHTTP
  Address 127.0.0.1
  Port    80

  Service
    BackEnd
      Address 127.0.0.1
      Port    8080
    End
  End
End

# https://lvh.me
ListenHTTPS
  Address 127.0.0.1
  Port    443
  Cert    "/usr/local/etc/pound.pem"
  AddHeader "X_FORWARDED_PROTO: https"

  Service
    BackEnd
      Address 127.0.0.1
      Port    8080
    End
  End
End

The /usr/local/etc/pound.pem file is a locally-generated, locally-signed SSL certificate.
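
One way to produce a file like this is with Ruby's bundled OpenSSL library. The sketch below is only an illustration: the `build_self_signed_pem` helper, the `lvh.me` common name, and the one-year lifetime are all assumptions, and it assumes Pound accepts the certificate and its private key concatenated in a single PEM file.

```ruby
require "openssl"

# Hypothetical helper: returns PEM text containing a self-signed
# certificate followed by its private key.
def build_self_signed_pem(common_name, days_valid: 365)
  key  = OpenSSL::PKey::RSA.new(2048)
  name = OpenSSL::X509::Name.parse("/CN=#{common_name}")

  cert = OpenSSL::X509::Certificate.new
  cert.version    = 2      # value 2 means X.509 v3
  cert.serial     = 1
  cert.subject    = name
  cert.issuer     = name   # self-signed: issuer and subject are the same
  cert.public_key = key.public_key
  cert.not_before = Time.now
  cert.not_after  = Time.now + days_valid * 24 * 60 * 60
  cert.sign(key, OpenSSL::Digest.new("SHA256"))

  cert.to_pem + key.to_pem
end

pem = build_self_signed_pem("lvh.me")
# Writing to the path used in the config above will likely need root:
# File.write("/usr/local/etc/pound.pem", pem)
```

Because nothing outside your machine vouches for this certificate, the browser warning mentioned below is expected.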

When I want to view an SSL-only app in my local browser, I just start Unicorn (my app container of choice), which defaults to port 8080. Then I visit https://lvh.me, which is helpfully set to the loopback address 127.0.0.1. That hits Pound, which terminates the SSL connection (after a warning about the self-signed certificate) and proxies the web request to Unicorn.
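
On the app side, the AddHeader line is what lets code behind the proxy tell that the original request was encrypted, since Unicorn itself only ever sees plain HTTP from Pound. A rough sketch of that check against a Rack-style environment hash follows. `forwarded_as_https?` is a hypothetical helper, and it assumes the header reaches the app under the usual Rack key `HTTP_X_FORWARDED_PROTO`.

```ruby
# Hypothetical helper: true when the proxy marked the request as HTTPS.
# Takes a Rack-style env hash; only the first value is considered if
# several proxies appended to the header.
def forwarded_as_https?(env)
  proto = env.fetch("HTTP_X_FORWARDED_PROTO", "").split(",").first.to_s.strip
  proto.casecmp?("https")
end

puts forwarded_as_https?("HTTP_X_FORWARDED_PROTO" => "https")
puts forwarded_as_https?({})
```

In practice you rarely write this yourself: Rack's `Rack::Request#ssl?` performs a similar check, which is how an SSL-only app can avoid redirect loops behind a terminating proxy.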

This is very similar to what happens when the app runs in production on Heroku's systems, except that they use a Procfile in the root directory of the app to configure Unicorn. Per factor 7, it allows Heroku to bind my Unicorn instance to any arbitrary port:

web: bundle exec unicorn -p $PORT -c config/unicorn.rb
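
The same idea applies to any process that has to honor factor 7: read the port from the environment rather than hard-coding it. A tiny sketch, assuming the platform exports `PORT` as in the Procfile above; the `resolve_port` helper and its 8080 fallback (chosen to match Unicorn's default) are illustrative only.

```ruby
# Hypothetical helper: pick the port from the environment, falling back
# to a local default when PORT is not set.
def resolve_port(env = ENV, default: 8080)
  Integer(env.fetch("PORT", default))
end

puts resolve_port({ "PORT" => "5000" })
puts resolve_port({})
```

`Integer()` raises on a malformed value such as "abc", so a misconfigured environment fails at boot instead of binding somewhere unexpected.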

Monday, January 30, 2012

Simple Resque lets you send Resque jobs from one codebase to another

I just released a small gem called simple_resque which abstracts a pattern that's become very common in my recent projects. I like using Resque as a job queue to move as much work out of the web application as possible. Unlike the usual Resque setup, I never put the workers in the same codebase as the web app. I like to keep the asynchronous parts of the app completely separate from the code that services web requests.

This required some hacking since Resque expects you to pass a Ruby class constant for the worker, but the webapp doesn't have those classes defined. simple_resque provides a thin wrapper over Resque's push method that mimics the way Resque.enqueue works, but doesn't require you to use a class constant.
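
To make the idea concrete, here is a rough sketch of the underlying trick, not simple_resque's actual source. It assumes Resque's documented job format, a JSON object with `class` and `args` keys, and builds that payload from a plain string so the enqueuing side never needs the worker class loaded. `build_job_payload` and the `ImageResizer` worker name are hypothetical.

```ruby
require "json"

# Hypothetical helper: builds a Resque-style job payload from a worker
# name given as a string or symbol, so no class constant is required.
def build_job_payload(worker_class_name, *args)
  JSON.generate("class" => worker_class_name.to_s, "args" => args)
end

puts build_job_payload("ImageResizer", 42, "thumb")
```

With Resque loaded, the equivalent call would be along the lines of `Resque.push("images", class: "ImageResizer", args: [42, "thumb"])`, where the queue name is also made up here. See the gem's README for its real interface.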

For more details check out: https://github.com/subelsky/simple_resque

Thursday, January 12, 2012

OtherInbox acquired by ReturnPath

I've written on and off about my experiences using Ruby and JavaScript and other technologies to build OtherInbox, the company I cofounded with Josh Baer. Today I'm happy to announce that the company has been acquired by ReturnPath.

It was a great ride, and I learned a ton. Congratulations to the whole team! 

I am now starting another round of the entrepreneurial cycle and have started working on something new and very cool. As usual I don't plan to announce details until the business is up and running and ready for new customers. So stay tuned!

Friday, December 23, 2011

Who owns vacant properties in Baltimore?

I received many ideas for my free software project and ultimately settled on one suggested by Kate Bladow: a tool to help identify potential slumlords in Baltimore. It's specifically designed to help Baltimore Slumlord Watch investigations, though that anonymous blogger has nothing to do with this tool (he or she has to do complete investigations of each property before writing a post). This is more like an experiment to use all available data to identify people and companies who may own a large number of vacant properties.

The tool combines data from three sources:
1. State of Maryland Real Property database: to get a complete list of every property in Baltimore, identified by a block and lot number (this database, unlike #2, allows wildcard searching, which is how we get the complete list). Includes a truncated field listing the owner name.
2. Baltimore City Real Property database: to find the complete owner name and mailing address.
3. Baltimore's Vacant Lots and Vacant Buildings open data sets. The anonymous slumlord watch blogger says that these are not very accurate or up-to-date, but hopefully they are good enough for us to identify who the main offenders are.

I applied a few cleanups and transformations to make the data more useful, and used the excellent Google Refine tool to try and reduce the noise I found in the Owner Name column. Many entities were listed under a variety of spellings, punctuations, and abbreviations, which Google Refine helped me combine. Thanks to Mark Headd for recommending Google Refine to me.
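
For anyone who wants to script this step rather than do it interactively, the sketch below approximates the key-collision ("fingerprint") style of clustering that Google Refine offers. It is an approximation in plain Ruby, not Refine's implementation and not the code in the repository; the helper names and the sample owner names are made up.

```ruby
# Reduce a name to a normalized key: lowercase, strip punctuation,
# then sort and de-duplicate the remaining tokens.
def fingerprint(name)
  name.downcase.gsub(/[^a-z0-9\s]/, "").split.uniq.sort.join(" ")
end

# Group spellings that collapse to the same key.
def cluster_names(names)
  names.group_by { |name| fingerprint(name) }.values
end

samples = [
  "Acme Holdings, LLC",
  "ACME HOLDINGS LLC",
  "Holdings Acme L.L.C.",
  "Zeta Trust"
]
cluster_names(samples).each { |cluster| puts cluster.inspect }
```

A key like this is deliberately aggressive, so each proposed cluster still deserves a human look before the names are merged.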

Below you will find a few lists of the top property owners in Baltimore gleaned from these tools.

Important Caveats
- Some properties are owned by companies using a series of one-up numbered company names (like "N# Inc." or "NB1 Business Trust", "NB2 Business Trust", etc.). I used Google Refine's clustering feature to combine similar names on the assumption that these are probably controlled by the same people. In the cases where I did this kind of grouping, I used sentence case instead of upper case or I replaced digits with the # sign.
- Many properties are owned by a uniquely-named LLC (like "1 E. Montgomery LLC"). One person or company could own a significant share of the vacant properties in Baltimore via shell corporations like this. One potential way to get around this is to look up the incorporation paperwork for each company (also available as a scrapeable database), but I'm assuming if you're smart enough to use shell corporations you're probably using a different company to be a registered agent. So this technique would probably only help us identify the main registered agents for the vacant property owners in Baltimore.
- I haven't done a great deal of authenticating or verifying. All I'm trying to do is make this data more discoverable/explorable. Obviously you should do your own homework before acting on any of this information.
- I was really surprised to see how much property is controlled by the city. Even if the absolute numbers below are inaccurate, the relative amount is pretty amazing. I'd like to see the city take some bold leadership on doing something with all of those buildings and lots. How about a revival of the dollar home program?
- I only focused on properties listed as non-owner occupied by the State of Maryland.
- The Slumlord Watch blogger says that the city's vacant building data is inaccurate and not up-to-date, so there may be false positives and negatives in the list.

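
The numbered-name grouping described in the caveats can be approximated with a simple substitution. The following sketch, with hypothetical helper names and a handful of sample rows, replaces each run of digits with `#` and then counts properties per masked owner name, largest first.

```ruby
# Replace every run of digits with "#", so "NB1 Business Trust" and
# "NB2 Business Trust" collapse to the same name.
def mask_digits(owner)
  owner.gsub(/\d+/, "#")
end

# Count properties per masked owner, sorted by count (descending),
# then by name for a stable order.
def tally_owners(owners)
  counts = Hash.new(0)
  owners.each { |owner| counts[mask_digits(owner)] += 1 }
  counts.sort_by { |name, count| [-count, name] }
end

rows = ["NB1 Business Trust", "NB2 Business Trust", "KONA PROPERTIES, LLC"]
tally_owners(rows).each { |name, count| puts "#{name} #{count}" }
```

Blanket masking like this would also rewrite names where the number is meaningful, such as a street-address LLC, so it is better applied only to clusters that have been reviewed by hand.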
Largest Vacant Property Owners in Baltimore, Grouped by Name
Owner    # Vacants
Baltimore City 1407
UP# BUSINESS TRUST 38
SS# BUSINESS TRUST 25
JAMES E. CANN 24
NB# Business Trust 24
State of Maryland 19
2008 DRR-ETS, LLC 18
BALTIMORE RETURN FUND, LLC 18
EAST BALTIMORE DEVELOPMENT LLC 18
COMPOUND YIELD PLAY, LLC 17
CE REALTY, LLC. & EPHRAIM WEINGARTEN 16
KONA PROPERTIES, LLC 16
CE REALTY, LLC 15
J.A.M. numbered corporations 15
BALTIMORE PREFERRED PROPERTIES LLC 14
DRUID HEIGHTS COMMUNITY DEVELOPMENT CORPORATION 14
HOLABIRD INVESTMENTS, LLC 14
NEW HORIZON DEVELOPMENT, LLC 14
DOMINION PROPERTIES LLC 13
COMMUNITY SOLUTIONS, LLC 12
M&S JOINT VENTURE DEVELOPMENT CORPORATION 12
MAHS-BE HOLDINGS, LLC 12
BALDWIN TRUSTEE, LEROY 11
HARRISON DEVELOPMENT, LLC 11
HUD 11
CHESAPEAKE HABITAT FOR HUMANITY INC 10
KGB numbered corporations 10
University of Maryland 10
L.A.M.B., INC. 9
REBUILD AMERICA, INC 9
CARTER, NATHAN 8
EQUITY TRUST COMPANY 8
KREISLER, SANFORD 8
LAMB, DERRICK 8
N-#, INC. 8
OAKMONT DESIGN LLC 8
SANDTOWN HABITAT FOR HUMANITY 8
DOMINION RENTALS, LLC 7
GREEN, CARL 7
HARBOUR PORTFOLIO 7
LEO, CAROLINE G. 7
N10 BUSINESS TRUST 7
NEIGHBORHOOD PROPERTIES-4, INC 7
SAUNDERS TERRAINE 7
EAST BALTIMORE DEVELOPMENT, INC 6
APP CONSULTING GROUP, LLC 6
DJ LAND CO, LLC & WODA GROUP LLC 6
EMERALD BAY DEVELOPMENT GROUP & ONE, INC. 6
FIRST NATIONAL DEVELOPMENT, LLC 6
JOHNSON, MARTIN 6

You can also download the entire list of non-owner-occupied vacant building owners in Baltimore.

Largest Vacant Lot Owners in Baltimore, Grouped by Name
Baltimore City 2926
B&D PHASE III, LLC 64
METRO II OLDHAM, LLC & SUNNYS ASSOCIATES, LLC 42
CAMDEN ASSOCIATES, LLC. 40
HARBORVIEW LIMITED PARTNERSHIP NO. # 35
State of Maryland 32
LOWMAN ST.,LLC 31
Oblate Sisters of Providence 27
BG&E 23
COMPANY, LLC & FEDERAL HILL HOLDING & SCC CANYON II, LLC 23
ATLAS MD I SPE, LLC & BB&T BANK (CREO), ATTN: T. GEORG 19
J & J PARTNERSHIP, INC. 19
Baptist Church 18
SANDTOWN HABITAT FOR HUMANITY 18
NANTICOKE INVESTMENT CO., LLC 17
L.A.M.B., INC. 15
CSX TRANSPORTATION, INC. & TAX DEPARTMENT 13
DRUID HEIGHTS COMMUNITY DEVELOPMENT CORPORATION 13
SINGER PARK & PLAY, INCORPORATED 13
STATION PLACE LLC 13
TRIMARK MANAGEMENT 13
ASSOCIATION, INC & MCHENRY POINTE HOMEOWNERS 12
Benedictine Society of Baltimore 12
CHESAPEAKE HABITAT FOR HUMANITY & INC 12
JUBB JR, WALTER H & JUBB, EDWARD H 12
CASTLEWOOD COMMUNITIES, LLC 11
MOUNT SINAI BAPTIST CHURCH & OF BALTIMORE CITY 11
MARYLAND JOCKEY CL 10
CONVENTION AND AUXILIARIES & OF BALTIMORE, INC. & UNITED BAPTIST MISSIONARY 9
DUNN, GREG 9
RIVERSIDE WORK FORCE LLC 9
BALTIMORE URBAN LEAGUE, & INC.,THE 8
C&P TELEPHONE COMP 8
CORPORATE SECRETARY, AMTRAK & NATIONAL RAILROAD & PASSENGER CORPORATION 8
DEVELOPMENT CORPORATION & DRUID HEIGHTS COMMUNITY & JACQUELYN D CORNISH 8
FRP HOLLANDER 95, LLC 8
HOLABIRD PARK APTS. INC 8
MUELLER HOMES, INC. 8
NEWSTAR DEVELOPMENT AT CANTON & PEAKS, LLC 8
SCARFIELD SR, FRANK D 8
THE KCR DEVELOPMENT GROUP & SPICER'S RUN HOMEOWNER ASSOCIATION
BALTIMORE SCRAP CORP. 7
BRIGHTON DEVELOPMENT GROUP & LLC 7
CHURCH, THE & VESTRY OF MOUNT CALVARY 7
FLAG HOUSE RENTAL I, L.P. & METRO PLAZA II 7
FOWLKES, ROBIN 7
PARADIGM BUILDERS, LLC & RICHARD MIRSKY - OFFIT KURMAN 7
URBAN HEALTH INSTITUTE OF & WASHINGTON, THE 7
CANN JAMES E 6
CHURCH OF THE REDEEMED OF THE & LORD, INC, THE 6

You can also download the entire list of non-owner-occupied vacant lot owners in Baltimore.

The Raw Data
All data used to create the above tables can be downloaded from GitHub, including the raw CSV data.

The Code

It's Creative Commons-licensed and posted on GitHub. It's pretty raw and unfactored. I ran it all from irb. It needs to be converted into a Rake task or other command-line-friendly, totally automated package.

Next Steps
- We could get this up and running on ScraperWiki to have the data constantly updated.
- We could run an Amazon Mechanical Turk project to create an up-to-the-minute database of vacant houses in Baltimore, using Google Street View. We could just ask each worker to use Street View to make an estimate of whether the house was vacant or not. I'm sure there would be some inaccuracy, but the data ought to be good enough to help further investigations.

Friday, December 16, 2011

The only social media advice you need

I keep thinking about How To Be Interesting, a 2006 blog post I read a few months ago. Russell Davies captured the essence of my social media strategy, what social media means to me, and why it's been so successful for me and others. If anyone ever asks me for advice in this arena, I am going to quote Davies:
...the core skill of any future creative business person will be 'being interesting'. People will employ and want to work with (and want to be with) interesting people.
Social media is a big deal because it helps you do the two things Davies recommends to cultivate that skill of "being interesting":

The way to be interesting is to be interested. You’ve got to find what’s interesting in everything, you’ve got to be good at noticing things, you’ve got to be good at listening. If you find people (and things) interesting, they’ll find you interesting. 
Interesting people are good at sharing. You can’t be interested in someone who won’t tell you anything. Being good at sharing is not the same as talking and talking and talking. It means you share your ideas, you let people play with them and you’re good at talking about them without having to talk about yourself.
It's not rocket science: social media helps you find more interesting things (such as when I found this article via Hacker News) and share them (like I'm doing with this post).

