Pinboard Blog

An Interview With Alex Bayley of Growstuff

Last year I chose Alex "Skud" Bayley as one of the six winners of the Pinboard Co-Prosperity Cloud, a prestigious $37 investment in six businesses that wanted to join me in thumbing their noses at Silicon Valley flapdoodle and just build something cool and mildly profitable on their own.

Alex's effort, Growstuff, is a community site for food gardeners. I think gardening is a terrific metaphor for maintaining a solo web project; there are dozens of small tasks that need frequent attention, or the site soon becomes unmanageable and withers. So I was curious to see what a community website around gardening would look like, and whether Skud could divert some of the energy from her industrious users to make her development efforts a little easier.

Skud had committed to open sourcing the site and making the development and planning process for it transparent, decisions that chilled me to the very marrow, but which she considered crucial to the success of her project. Reasoning that I had the most to learn from people who took a radically different approach, I forked over the $37 and sat back to see what would happen.

Several weeks ago, I got to meet Skud here in San Francisco, ate fancy little sandwiches with her, and got to hear a lot about her website. Now that Growstuff has launched, Skud has kindly agreed to share some of our conversation as an email interview:

MC: Congratulations on your public launch! What's next for the project in the months to come?

AB: Thank you! Our public launch was just the beginning. Our initial goal was to build something with basic functionality, that you could use to track your garden and which would also show the potential of the site in terms of aggregating data and making it available both on the website and via an API. That's what we launched in June. It's only a small part of what we hope to build, though. We have heaps more to do, both in terms of user-facing functionality and in terms of our open data and API.

In July, I surveyed and interviewed a number of our members and we developed a roadmap for what's next. We had so many things on our development wishlist that we had to whittle them down somehow. In the end we came up with a list of things we want to work on that we think is achievable and that'll make the most people happy. It's posted on our wiki, but in short we want to improve the crop database; build ways to track your seed stash, harvests, and wishlist; work on a "1.0" API (as opposed to the current "version 0" we have); and a bunch of other little stuff.

MC: How did you spend my $37 investment?

AB: At that time, I was running Growstuff as a hobby project, and hadn't incorporated or anything. My only real expenses were web hosting, so it went to that. Which I guess is pretty much what you expected. Or another way of looking at it is that it paid for my groceries. When you have a hobby project and just one personal bank account, it's all a bit unclear. Since then I've incorporated Growstuff and the expenses have increased. I actually posted about our first few months' finances on the Growstuff blog. I found it incredibly useful to see other small tech businesses' financial spreadsheets and breakdowns when I was just starting out, so I thought I'd pay it forward.

MC: How many gardeners do you have going now? Is it following the usual pattern of a few very heavy users, and a larger number of occasional ones?

AB: Around 400 members at present. And yes, the usual activity distribution seems to apply, though the numbers are small at present and there is a pretty strong/active community of early adopters, so it's heavier on the "active" end than I'd expect to see when we have 400,000 members.

MC: You have a fairly extensive roadmap on the site, and it seems like you're amending it based on user feedback. What's the biggest divergence so far between stuff you planned and what has actually happened?

AB: Very little, really! We just passed our one year anniversary of the original idea of Growstuff, and I marked the occasion by revisiting the blog post I made back in July 2012. What I described there is basically what we've built (and are building). I think the vision of "Ravelry for food gardens" was pretty clear in the first place. It's only in the details that things have changed: we realised we needed to have a hierarchy of crops to note all the varieties of different things, we didn't manage to get international/alternative names for crops into the system as quickly as we would have liked, etc. But the overall bones of it are still what was originally talked about.

MC: When we spoke in San Francisco, it sounded like the open source part of the project had been a success, with a variety of people committing code. What's your secret to getting and keeping active contributors?

AB: Yup! We have over 100 people on our "discuss" mailing list where most of the dev work takes place. A lot of people join us because we advertise that we do pair programming and welcome non-experts. I suspect people stick around because it's a warm-fuzzy sort of project: you feel like you're doing something good and the people are friendly and appreciative. Plus a lot of people seem to just want something like Growstuff to exist, either for themselves or for people close to them. I hear a lot of people saying that their family members would be really into a site like Growstuff, and that seems to give them a really grounded sense of building something useful.

MC: Money! How do you make it, and is it enough to live on?

AB: Through paid memberships, which get you special features on the site (i.e. a "freemium" model). At present these aren't enough to live on, but I'm also involved in an Australian government program for small businesses, which pays me about $1100 a month to work on this for 1 year. Between that and savings I'm doing okay for now, while Growstuff memberships are paying Growstuff's other expenses (including hiring a graphic designer, getting a lawyer to review our TOS, and stuff like that). I'll really need to start paying myself sometime in the next year, though, so we're going to have to build membership and make sure the paid accounts are compelling enough to get lots of people to subscribe and support the site and, ultimately, pay my rent.

MC: Is it true that Australia has a bizarre, non-cyclical pattern to its growing seasons, or is that just Yankee propaganda?

AB: Sort of true! There are at least four things that make Australian weather seem strange to northern-hemisphere people. The first, of course, is that the southern hemisphere is offset by 6 months due to the earth's axial tilt: our side of the world is closest to the sun around December-January. That sounds obvious, but wasn't so for the first white colonists here, who took a while to get used to it.

The second thing is that even in the temperate parts of Australia (including Melbourne, where I am) the seasons don't follow summer-autumn-winter-spring in quite the same way that e.g. Europe does. For instance, summer for us is incredibly dry and hot, which means that it's *not* a peak growing time, unless you use a lot of artificial irrigation. Our lawns, for instance, tend to go through a growth spurt as the weather cools down. Here's a more realistic chart of seasonal trends in my area.

Then, thirdly, lots of Australia is in the tropics, which means that instead of having "summer" and "winter" it's just hot all the time and you have "wet" and "dry" seasons. And finally, Australia is very strongly affected by the El Niño/La Niña weather patterns, which have to do with slow-moving currents in the Pacific Ocean and involve multi-year fluctuations. So we often have a few dry years followed by a few wet years. When you hear about droughts and/or floods in Australia, it's often to do with this. So, all in all, it's complicated. And don't even get me started on our soil quality, which adds a whole new level of complexity for Australian gardeners trying to deal with growing traditions that were developed in Northern Europe. Ugh.

MC: Does the fact that gardening is a seasonal activity affect the rhythm of your work on the site?

AB: Not much, so far, though I expect that when our membership gets bigger we'll see distinct seasonality in our membership and activity. Northern hemisphere spring/summer is likely to be the busiest time.

MC: You've worked in the thick of things in the Bay Area, as well as back home. Would you advise people following your footsteps to get as far away from computerland as possible, or make an effort to be where other coders are?

AB: This is such a complicated question for me. I've had a very up-and-down experience in the tech industry, and the Bay Area had some of the highest "ups" and the lowest "downs". For me, being around heaps of incredibly smart people was great, but being around people intent on making bucketloads of money at the expense of other people's wellbeing was horrible. I had mixed feelings about leaving, but I think it was the right choice. However, getting away from the Bay Area doesn't mean I got away from coders, and "computerland", if it's anywhere, is everywhere. I'm still working with coders every day, through in-person and remote pairing sessions and online discussions, but now I'm working with a different set of coders, mostly outside the Bay Area bubble.

As far as what I'd advise others: think hard about your values and act according to them. If your values match those of the Bay Area tech community, and you're able to do so, then by all means work there. If they don't, then don't.

MC: What can we do to encourage other people to start small, self-sustaining businesses?

AB: It seems like we have pretty strong informal networks to support each other, and I've had heaps of advice from you and from other indie tech founders. At the same time, I've found more formal groups (mailing lists, industry associations, meetups, etc) to be largely useless. I have no idea why this is, but I wish we could fix it.

I mentioned I've been involved in an Australian government program for small businesses. They weren't very tech-savvy, but as long as you ignored the tech side of things, they actually provided a lot of the information and training I needed when it comes to running a business: taxes, incorporation, planning, marketing, all that stuff. I'd love to see more of this.

Apart from that: I think we just have to keep doing what we're doing, and keep talking about why (and how) we're doing it. I think of indie tech businesses as being like farmers markets: there are lots of reasons to be unhappy with the systems of industrial food (or tech) production, but we can provide a meaningful alternative, and I think we have a growing community of people who appreciate that.

MC: Thank you so much for your time!

If you have a garden, big or small, aspire to have one, or just want to hack on some code in hopes of scoring some local vegetables, go check out Growstuff!

—maciej on August 26, 2013



Thoughts on Colocation

After a week of slogging servers around northern California, I thought a brain dump on colocation might be useful to readers, and to future me.

I wrote about the difference between colocation, leased servers and other kinds of hosting in an earlier post. This one is strictly about colocation, the 'Condo' approach where you own a bunch of hardware and need a place to put it.

What you are after is not complicated:

  • A physical enclosure
  • Some kind of Internet connection
  • Power
  • Security guards
However, buying it is a pain in the neck.

Money

It's typical to sign a one- or three-year contract. Right off the bat this introduces an element of pressure, since you're making a fairly binding, long-term decision without knowing what you're doing. Unless you live in the Bay Area, colocation is a cost on par with your rent or mortgage.

Worse, I've found that the cost for identical configurations can vary by a factor of three or more. You really do need to shop around. And you need to be especially careful of punitive terms for things like overstepping bandwidth or power requirements. These are things you have to dig out of the fine print of contracts at a moment when you just want the whole process to be over.

Salespeople

Renting colo space feels like buying a car. You typically decide on a specific configuration you want, and then ask for quotes. For the salespeople involved, this is just the start of a long conversation they want to have with you. They'll be very curious about your budget, and want to talk about the hosting equivalent of an underbody clearcoat (various "hybrid cloud solutions") and extended warranty.

Although colo space is a commodity, salespeople become tetchy if you treat it as such. They will insist on talking to you over the phone and bristle at the suggestion that their job could be replaced by a web form. It is a good idea not to think about how much their salary or commission adds to your costs.

The reason I say colo space is a commodity is not because all facilities are the same, but because small-time clients will have no practical way of assessing their quality. There are certainly some facilities that are obviously bad, but most data centers have sane policies, look nice if you visit, and talk eloquently about uptime. The only way to evaluate a data center is to go through a series of small and big outages together, but by then you're already in a multi-year contract. So in practice, there is not enough information to pick a "better" data center, just a bunch of anecdotes on message boards. I believe you're better off getting space in two cheap places than trying to pick one high-end one.

Resellers

A lot of colo space is resold through intermediaries. This is often the only way to get a smaller amount of space than a full cabinet. Someone rents a bunch of cabinets at a data center and then parcels them out by the slice to clients, in pieces as small as 1U.

There are two things to watch out for in this arrangement:

  1. A bunch of people will have access to your physical equipment. Good resellers will take pains to introduce the various people in a cabinet to one another (or at least provide contact info).
  2. If your provider gets in a dispute with the data center, you may not be able to physically take out your hardware. It may even be seized without advance warning.

Power

Power is the great bugbear in hosting. You need to know how much your equipment uses, and how much you're likely to need. It is also often the limiting factor.

A useful rule of thumb in my case has been 1 rack unit = 1 amp. However, it is quite difficult to estimate the power consumption of a server before you buy it. You end up having to plug it in to a Kill-A-Watt meter under normal load.

A full cabinet is somewhere around 42 units, but a typical full cabinet power allotment is 20 amps. So you can't just fill a cabinet with servers.

The power situation is even worse than it sounds. That 20 amp figure is strictly theoretical, since you aren't allowed to use the full amount. You are limited to 80% of this figure, so a "20A" cabinet has 16A usable power, enough for eight pinboard-style servers.
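The arithmetic above fits in a few lines of Python. This is only a sketch: the 80% limit, the 20 A allotment and the 42-unit cabinet come from the text, but the figure of 2 A per server is an assumption inferred from the "eight servers" count (a 2U box under the 1 rack unit = 1 amp rule of thumb), and the function names are made up for illustration.

```python
import math

DERATING = 0.8  # only 80% of the nominal circuit may be used (from the text)


def usable_amps(nominal_amps: float) -> float:
    """Amps you may actually draw from a circuit rated at nominal_amps."""
    return nominal_amps * DERATING


def servers_that_fit(nominal_amps: float, amps_per_server: float,
                     rack_units: int = 42, units_per_server: int = 2) -> int:
    """Servers a cabinet holds, limited by power or space, whichever runs out first."""
    # The tiny epsilon guards against float results like 7.999999999 flooring to 7.
    by_power = math.floor(usable_amps(nominal_amps) / amps_per_server + 1e-9)
    by_space = rack_units // units_per_server
    return min(by_power, by_space)


if __name__ == "__main__":
    # Assumption: a 2U server drawing about 2 A under normal load.
    print(usable_amps(20))             # usable amps in a "20A" cabinet
    print(servers_that_fit(20, 2.0))   # power-limited server count
    print(servers_that_fit(15, 2.0))   # the same servers on a 15 A allotment
```

Under these assumptions the cabinet has room for 21 such servers but power for only eight, which is why power, and not space, is the number to negotiate.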

Finding Datacenters

Finding information is another pain in the neck. The place of choice is an awful forum called WebHosting Talk. There is an open business opportunity to anyone who can stick a front-end interface on this site that lets you enter number of servers, bandwidth, physical location, and spit out a list of offers.

Another business opportunity is to make an authoritative directory of Bay Area data centers, since there is a bewildering assortment of sellers, re-sellers, and re-re-sellers offering the same physical space. Conversely, some large providers maintain multiple facilities.

Earthquakes

Nobody likes to talk about earthquakes. But anything in the Bay Area or Seattle is going to come crashing down at some point. Another thing that has proven impossible is finding out what facilities are at highest risk. It's one thing to go offline when the Big One comes (half the Internet will be down with you). But losing a rack full of hardware into the maw of the earth is worse.

So here's my plea to hackers: figure out where stuff is physically hosted, correlate it with seismic hazard maps, and make a nice web form that lets people shop for specific power/bandwidth/space configurations without talking to salespeople. Charge money for it! I will pay you! Others will pay you!

—maciej on August 03, 2013



Seeking Bay Area Colo Space

I successfully moved my backup servers to Sacramento this week, but I'm still looking for a Bay Area colo for the main site, which has outgrown its current home.

Here's what I'm hoping to find:

  • Half or full cabinet
  • 100 Mbps capped bandwidth
  • 20 A @120 V
  • One or three-year contract
  • Full 24/7 physical access, within 100 km of San Francisco

The best offer I have in hand right now is from HE at their Fremont 2 facility, who are asking $600 per month for a full cabinet, but with skimpy power (15A) and a $200 setup fee to install square-hole posts.

Make me a better offer and I'm yours! Email me at maciej@ceglowski.com.

—maciej on August 02, 2013



Upcoming API Downtime

I've found out on short notice that I must vacate my current hosting facility before the end of the month. This will mean physically moving about eighty pounds of Pinboard machinery from San Jose to a new home in Sacramento.

The servers involved run the Pinboard API. I'm going to try switching all API traffic to the web server (which is hosted elsewhere and will not be affected by the move) but if it turns out to be too much load, I will need to take the API offline.

In the pessimistic case, the API will be down from early Monday evening California time until Wednesday morning.

The outage will also affect archiving. About half of Pinboard users won't be able to reach their archives during the outage. I will extend affected archiving accounts by one week as compensation.

The website and RSS feeds will remain up and running.

Nothing makes me feel more alive than a midsummer motor car ride through California's gorgeous Central Valley. I apologize to my users for this self-indulgence, and promise not to make it a regular habit.

—maciej on July 28, 2013



Pinboard is Four Years Old

Today marks four years since I opened the creaky gates and started charging customers money for Pinboard.

Here are some site stats for this year, compared with the three previous years:

                 2010     2011     2012     2013
bookmarks        3.5 M    27 M     53 M     76 M
tags             11 M     76 M     135 M    178 M
active users     2.8 K    16 K     23 K     23 K
bytes archived   200 G    3.0 T    5.9 T    8.8 T
downtime         6 h      29 h     22 h     12 h*
unique URLs      2.5 M    16 M     32 M     48 M

* this is site downtime; API downtime was much worse, perhaps 48 hours in all

The site has continued to grow at a steady clip, adding as many bookmarks and tags this year as last year. Total revenue from signups and archiving has been far steadier than I expected from a web project, which by nature tends to be spiky. This comes as a considerable relief, since it means I don't have to hunt for a new brand of champagne or truffle oil every other month. The number of active users has found a steady state, with as many people joining the site as dropping off of it in any given month. Before growing the site much further, I would like to get better at handling support requests at this level.

It's been an active year behind the scenes. On top of the usual code gardening, I spent some time working to better secure the site, introducing API tokens for password-free authentication, moving everyone to TLS without breaking everything, and adding various cookie flags and HTTP headers to make the site a little more resilient to bad people in public places.

The major new features this year were tag bundles, privacy lock, major improvements to search, and the beginnings of a bulk tag editor.

I gave talks on Pinboard at Brooklyn Beta, CUSEC, and InfoShare Gdansk, and in the process got to meet Pinboard users in Osaka, Warsaw, Paris, Lyon, Stockholm, London and Berlin. This was an enormous amount of fun for me, and really helped to get me out of the house.

Finally, I made my first foray into venture capital with the Pinboard Investment Co-Prosperity Cloud. The six winners have been hard at work, and I look forward to writing about what they've achieved in the coming weeks.

Thanks to everyone who has helped me with this project over the past four years, in big and small ways. July 9 is one of the happiest days of my year, and I owe it all to kind people from across the Internet.

—maciej on July 09, 2013



Persuading David Simon

[June 19 update: David Simon has been kind enough to respond at length here. He points out that I falsely stated that collecting call records requires a warrant; I have corrected that statement in the post below.]

I read with interest David Simon's recent blog post in which he responds to revelations that the NSA has been collecting the call records of all American mobile phone users.

David Simon, of course, created The Wire, a television series where institutions take on lives of their own and defy attempts by well-meaning people to reform them from within. So it came as a real shock to find Simon criticizing pundits who have objected to the extent of NSA surveillance, and accusing them of wilful ignorance about the nature of police work.

Mr. Simon pointed out that law enforcement agencies have been allowed to capture call records for decades, including in cases where the information harvested includes calls from people who are not under suspicion. In other words, there's nothing new going on to get worked up about.

Having labored as a police reporter in the days before the Patriot Act, I can assure all there has always been a stage before the wiretap, a preliminary process involving the capture, retention and analysis of raw data. It has been so for decades now in this country. The only thing new here, from a legal standpoint, is the scale on which the FBI and NSA are apparently attempting to cull anti-terrorism leads from that data. But the legal and moral principles? Same old stuff.

Seeing no difference in principle, only a difference in degree, in the NSA's surveillance program, Simon expresses annoyance with Americans who demand total protection from terrorism and then purport to be shocked when their government takes their requests seriously.

Mr. Simon cites the specific example of an investigation he covered as a police reporter in Baltimore in the 1980s. Criminals were using pay phones and pagers to evade detection, and tracking them down required indiscriminately recording numbers dialed from those pay phones, with the goal of sifting through the data later to find the pager numbers.

He argues that this kind of investigation, which targeted pay phones, was in some ways more invasive than the kind of tracking the NSA is accused of, since people expect to be anonymous when using a pay phone in a way that doesn't apply when they're calling from their own cell.

There is certainly a public expectation of privacy when you pick up a pay phone on the streets of Baltimore, is there not? And certainly, the detectives knew that many, many Baltimoreans were using those pay phones for legitimate telephonic communication. Yet, a city judge had no problem allowing them to place dialed-number recorders on as many pay phones as they felt the need to monitor, knowing that every single number dialed to or from those phones would be captured. So authorized, detectives gleaned the numbers of digital pagers and they began monitoring the incoming digitized numbers on those pagers — even though they had yet to learn to whom those pagers belonged. The judges were okay with that, too, and signed another order allowing the suspect pagers to be “cloned” by detectives, even though in some cases the suspect in possession of the pager was not yet positively identified.

I think Simon's fundamental argument, “same old stuff”, is mistaken in a number of important ways, and that some of this reflects our failure as technologists to communicate what modern surveillance can do.

First, there is the scope of the order. The Baltimore operation, and others like it, were limited to a specific criminal investigation. They were conducted under a subpoena setting limits on what would be collected, and for how long.

The NSA program is universal and appears to be open-ended. Information is collected in aggregate. The program operates under the authority of a secret court order, not a warrant. It is not clear whether the Administration even believes this type of surveillance requires a court order.

Second is the nature of the body carrying out the surveillance. In Simon's example, this was a municipal police force, overseen by a local court.

In our case, it's the NSA, a Federal agency whose job has traditionally been to collect foreign signals intelligence. The operation is overseen by a secret court system called FISC.

Third is the nature of the data being collected. When the Baltimore investigation took place, it collected a simple list of telephone numbers dialed from the monitored phones.

Modern call records contain much more data, reflecting the fact that almost all of us carry cell phones. A call record now includes unique device identifiers, routing information, cell tower IDs, and a wealth of additional information about the circumstances and location of the call. The location data is particularly powerful, turning mobile phones into de facto tracking devices whenever they are turned on.

Fourth is the question of oversight. The evidence used in the Baltimore case was collected by municipal police and presented (I'm assuming) in open court. Those against whom it was used had the chance to mount a defense, appeal the verdict to state and Federal courts, and enjoyed the presumption of innocence guaranteed to them by the Constitution.

The NSA call data is collected and used in secret. The agency is overseen as part of the very large national security establishment by a small, overworked group of legislators and senior government officials who have the requisite security clearance.

So I contend that the parallel Simon makes is false. The NSA is not a law enforcement organization, and intuitions from police reporting don't carry over.

But even if we grant the analogy, I think there's a more dangerous argument in Simon's essay, which is the contention that two programs that differ only in degree are necessarily "the same old thing". I believe this is not a safe assumption to make when talking about computers and their use in domestic surveillance.

In the portion of his essay that excited the most comment, Simon appears to express disbelief that the NSA can make broad use of the data it gathers:

When the government grabs every single fucking telephone call made from the United States over a period of months and years, it is not a prelude to monitoring anything in particular. Why not? Because that is tens of billions of phone calls and for the love of god, how many agents do you think the FBI has? How many computer-runs do you think the NSA can do — and then specifically analyze and assess each result?

Well, of course, the answer is "you would not believe how many 'computer-runs' the NSA can do". I believe this part of the essay especially caught tech people's attention, since it suggested that Simon might be naive about the capabilities of a modern datacenter. It's certainly the part Clay Shirky pounced on in his rebuttal.

But Simon is not a fogey who doesn't understand how powerful computers have become (though I feel that there are such people in positions of oversight in the House and Senate). I believe his error is in assuming that the analysis of these 'computer-runs' is any kind of bottleneck. There are powerful techniques for surfacing interesting features in any comprehensive list of interactions between human beings. I've written in the past about my distaste for the 'social graph' and the perverse worldview it imposes on our projects, but part of the appeal of that worldview is the real power of mathematics applied to exactly this kind of data. The analysis can be automated, and no good comes of it.

In a beautiful worked example, Kieran Healy has shown how a precocious British intelligence service could have identified Paul Revere as a person of particular interest based only on a set of membership lists of organizations he belonged to.
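The core of that kind of analysis is small enough to sketch. Given nothing but who belongs to which group, you count the groups each pair of people shares, then rank people by how many others they are connected to. The names and groups below are invented for illustration and are not the data from that example; counting distinct co-members is a deliberately crude stand-in for the more refined centrality scores a real analysis would use.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical membership lists (group -> members). Invented data.
MEMBERSHIPS = {
    "tavern_club": ["alice", "bob", "carol"],
    "reading_society": ["carol", "dave"],
    "militia": ["carol", "erin", "frank"],
    "church_committee": ["bob", "carol", "erin"],
}


def shared_groups(memberships):
    """Count, for every pair of people, how many groups they both belong to."""
    counts = defaultdict(int)
    for members in memberships.values():
        # Sorting makes each pair appear in one canonical order.
        for a, b in combinations(sorted(set(members)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def rank_by_contacts(memberships):
    """Rank people by the number of distinct others they share a group with."""
    contacts = defaultdict(set)
    for a, b in shared_groups(memberships):
        contacts[a].add(b)
        contacts[b].add(a)
    # Most-connected first; ties broken alphabetically.
    return sorted(contacts, key=lambda p: (-len(contacts[p]), p))


if __name__ == "__main__":
    for person in rank_by_contacts(MEMBERSHIPS):
        print(person)
```

Nothing here reads a letter or listens to a call. The person who bridges the most groups simply falls out of the bookkeeping.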

The point is, you don't need human investigators to find leads, you can have the algorithms do it. They will find people of interest, assemble the watch lists, and flag whomever you like for further tracking. And since the number of actual terrorists is very, very, very small, the output of these algorithms will consist overwhelmingly of false positives.
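It helps to run the numbers on that last claim. Every figure below is an assumption chosen for illustration, not an estimate of any real program: a population of 300 million, 1,000 actual targets, and an algorithm that catches 99% of the targets while wrongly flagging only 0.1% of everyone else.

```python
def flagged_breakdown(population, targets, hit_rate, false_alarm_rate):
    """Return (true positives, false positives, share of flags that are real)."""
    true_pos = targets * hit_rate
    false_pos = (population - targets) * false_alarm_rate
    precision = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, precision


if __name__ == "__main__":
    # All four inputs are made-up illustrative values.
    tp, fp, precision = flagged_breakdown(
        population=300_000_000,
        targets=1_000,
        hit_rate=0.99,
        false_alarm_rate=0.001,
    )
    print(f"real targets flagged:         {tp:,.0f}")
    print(f"innocent people flagged:      {fp:,.0f}")
    print(f"share of flags that are real: {precision:.2%}")
```

Even with an algorithm that generous, roughly 300,000 people get flagged and fewer than one in 300 of them is a real target.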

It's at this point that Simon's logic starts to work in the other direction. Given a long list of potential leads, investigators are going to focus on vetting the most likely, rather than taking any steps to clear false positives out of the system. The penalty for missing a real terrorist is catastrophic, while the penalty for falsely accusing someone (when not only the accusation, but the very existence of the program, is secret) is nonexistent, even if the secret accusation ends up doing real harm. Limits on manpower won't constrain the investigation; they will only reduce its overall quality.

This isn't an abstract argument. We are all familiar with the tenebrous no fly list, a document that prevents several thousand people from traveling by air, and condemns thousands more to intrusive security measures each time they want to get on a plane. After 2001, this list rapidly expanded to thousands of names, with no avenues of appeal and no way to even check whether your name appeared in the document, to the point where the government finally had to improvise a 'redress' policy for travelers who found themselves living out a Kafka novel.

Characteristically, proposals for fixing the no-fly list and similar watch lists now call for collecting even more information, to help disambiguate people who share a name but not a date of birth with someone on the watch list. The basic problem—that lists of suspects are generated without accountability, without oversight, and with no incentive to avoid mistakes—persists.

There's also a more dangerous institutional problem to consider. When a system like this exists, it creates pressure for its own use. What is the point, after all, of having a very elaborate, extremely expensive database if you are only ever going to use it in exceptional cases? It is the nature of law enforcement to want to go after bad guys with all available tools. We saw a vivid demonstration of this in the years after the 2001 attacks, when the administration attempted to blur the lines between the 'War on Drugs' and the 'War on Terror', arguing that the proceeds from narcotics sales paid for terrorism.

Consider, too, a technique that has become standard in Federal investigations. It is a felony to make false statements to a Federal agent, and investigators routinely make use of this fact to gain leverage over a witness or suspect. People tend to be nervous when they talk to police, and unless they know better are liable to give inconsistent answers during questioning. Good interrogators can convert each of these inconsistencies into a felony count. Imagine how much more potent this tactic becomes when investigators can gain access to a database of your movements and contacts for the past decade.

The security state operates as a ratchet. Once you click in a new level of surveillance or intrusiveness, it becomes the new baseline. What was unthinkable yesterday becomes permissible in exceptional cases today, and routine tomorrow. The people who run the American security apparatus are in the overwhelming majority diligent people with a deep concern for civil liberties. But their job is to find creative ways to collect information. And they work within an institution that, because of its secrecy, is fundamentally inimical to democracy and to a free society.

I can't believe that David Simon, of all people, doesn't see the danger inherent in a permanent domestic surveillance program. I doubt that he would support a government initiative for all Americans to wear tracking devices in the name of fighting terrorism. Yet the NSA data collection program, whose output is functionally identical, seems not to trip the same alarm bells with him.

-:-

In public statements, the NSA director has defended domestic surveillance as a vital tool in preventing terrorism.

The term 'terrorism' is a magic word, unlocking government powers we normally associate with wartime. The current and previous Administrations have, at various times, asserted the right of the government to conduct invasive and open-ended surveillance on people it suspects of terrorism, detain suspects in terrorism cases indefinitely without trial, 'render' them to countries for interrogation and torture, kill people it considers terrorists, including American citizens, with giant flying robots, or keep such people alive against their own will.

This is total power over human life. The authorities assure us that numerous checks exist to prevent abuses of this power, but of course the checks are also classified. The government is promising that the secret police won't put innocent people in the secret prisons because the secret courts would never allow it.

This system puts enormous pressure on a small group of fallible human beings. For the secrecy to work, the number of people in on the secret must be small. But this group is all part of the same hierarchy, subject to the same pressures, and unable to communicate its concerns outside the same closed circle.

Talk of secret prisons, indefinite detention, and force-feeding can sound tendentious (though it's all uncontested public record!). Americans have a deep faith in the rule of law and have not proven receptive to the argument that truly innocent people will find themselves placed in the "terrorist" category by accident.

There is a tendency among those who grew up under the rule of law to treat it like the Rock of Ages, an immovable substrate in which all the institutions of the state are forever anchored. And so even ordinarily skeptical people tend to assume that the government obeys its own laws when no one is looking. To an astonishing extent, and to the great credit of American civic life, this is actually true.

But I think a better metaphor for the rule of law is that it is the soil in which democratic institutions take root. Like the soil, it can be depleted. And once depleted, it is not easily replenished.

Secrecy erodes the rule of law because it makes democratic accountability impossible. Secrets can't be held too broadly, so secrecy concentrates responsibility and asks too much of human nature. That is why every intelligence agency, unless given rigorous outside oversight, commits terrible excesses.

I think Simon agrees about the perniciousness of this secrecy. In a later rebuttal he's called for a modern-day version of the Church Committee, a group of people from outside the security establishment with top-secret access and the power to compel testimony.

And I agree with Simon that the current state of affairs is the "inevitable consequence of legislation that we drafted and passed."

American politics since the Cold War has operated under the conceit that national security must transcend partisan differences. And so we have seen large bipartisan majorities voting for pre-emptive war and domestic surveillance even though both of those policies were highly controversial outside Congress.

This tradition has created a vast space beyond political accountability. When both political parties pursue a nearly identical policy, there are no electoral consequences when the policy proves disastrously wrong. Who do you vote against?

People have good intuitions about the danger of indiscriminate collection and retention of their data. They're not being hysterical. For the last decade, we've been concentrating on how to regulate the way this data gets used in the private sector. But now that the coercive power of the state has entered the picture, the stakes are much higher, and we have an opportunity to politicize the debate. David Simon tells us to resign ourselves to the consequences of technological change:

"The question is not should the resulting data exist. It does. And it forever will, to a greater and greater extent."

But I think that is wrong. Whether the data should exist, and for how long, is exactly the question. The answer is not a technological inevitability, but a political choice.

I believe a world in which everything is recorded and persists forever carries the seeds of something monstrous. It is in the nature of computer systems to remember things indefinitely, but there's nothing difficult about programming machines to forget. It just requires laws to do it. We can't treat it as a technical problem. And to get the laws passed, we need to politicize the issue.

Still, these barricades are going to seem awfully lonely if we can't even get David Simon up there with us. The man should be a natural ally, and the fact that he sounds so exasperated troubles me. The fact that he seems resigned to a future of total information retention troubles me. The fact that we are talking past each other troubles me most of all.


Simon also mentions the FBI, but it's unclear to me that this agency has anything to do with the accusations of widespread call monitoring.

The expensive part is keeping everything secret, and staffing it with people cleared for such access. The database itself is likely quite modest in size.

The Washington Post has estimated the number of people with Top Secret clearance at 854,000. The number of people with full knowledge of all secret programs is much smaller, as this information is carefully compartmentalized.

Except Pinboard archives. Those are great!

—maciej on June 15, 2013



Berlin Meetup Aftermath

Our meetup in Berlin yesterday proved to be the best-attended one yet, with fifteen Pinboard users and one extraordinarily patient child fighting their introvert instincts on a beautiful spring day.

We had a diverse group of people from across the US and Europe, ranging from designers and web developers, to game developers and even a mobile app developer! Luckily everyone found a common language.

In order not to drink beer on an empty stomach, I stopped in a Bavarian restaurant before the meetup and ordered the item below, about which I would like to make the following observations:

  1. Wrapping this thing in bacon would actually make it healthier.
  2. It appears to be made from the heart of the last person to order it.
  3. Clicking 'enhance' on this photo crashes iPhoto and turns its icon into a ham hock.
  4. It's hard to see, but the parsley is hovering a few millimeters above the meat field.
  5. If I ever visit Bavaria, back up your bookmarks.

I sincerely thank everyone for coming and hope to see you in Berlin again soon!

—maciej on June 09, 2013



Berlin Meetup

I'll be in Berlin on Saturday, June 8, and invite all Pinboard users in the area to meet me at the PraterGarten at 16:00 for sausages and light chat.

Some people have asked what goes on at these Pinboard meetups. Topics typically discussed are the local beer, whether to order more of it, and who everyone is and how they are living their lives so far.

I try to avoid computer talk unless it's clearly necessary. Of course, if there are bugs or flaws in Pinboard that you find particularly oppressive, this is the perfect chance to hold me accountable.

Please RSVP if you would like to come, so that I have a rough idea of the head count and can warn you if plans change.

—maciej on May 31, 2013



Stockholm Meetup

Warm thanks to Massimo, Stella and Erik for coming out on short notice for a nice Pinboard lunch in Stockholm. That makes two librarians and a complexity theorist, which seems like a representative sample of users.

It's a big treat for me to meet Pinboard people while I travel; it not only makes the world feel like a welcoming place, but also helps the whole project feel less abstract.

The lunch brought with it a heavy blow (the kitchen was out of the famous kötteboller, or Swedish meatballs), but my new friends helped me cope.

—maciej on May 30, 2013



Seeking a Summer Pintern

I'm looking for someone to work with me on Pinboard from June 15 to September 15, 2013.

This is a remote position. You'll set your own hours and coordinate with me online.

The pinternship pays a modest stipend of 6,000 USD. More valuable is the chance to develop your skills as a generalist web developer. ‘Generalist’ may not sound exciting, but it's actually a rare person who understands a production web app from top to bottom.

You'll spend three months learning every aspect of running Pinboard, a reasonably complex website with about 20,000 active users. Each layer of the ‘web app stack’ will come to feel like a close and trusted friend. Then that friend will betray you, and you'll practice extracting the knife from your back as you hide your tears from an uncaring world.

This skill, once properly developed, will prove useful in landing high-paying computer jobs.

What You'll Do

Part of your job will be to help me with the day-to-day operation of the site, including troubleshooting and finding ways to automate time-consuming work.

The rest of the time you'll spend on a series of projects in areas of your choice. We'll spec out the projects together, you'll build them, and they'll go live on the site. Then we will scramble to fix them.

Here are some examples of things you could work on:

  • Care and feeding of a production MySQL database.
  • Caching, including memcache, varnish and pound.
  • Helping me implement version 2 of the Pinboard API.
  • Hacking on the Sphinx-based search engine.
  • Writing a better web crawler.
  • Making improvements to the job queue and scheduling system.
  • Writing tools for text parsing and analysis.
  • Machine learning and classification.
  • Building custom UI components in JavaScript.
  • Writing browser plugins for Chrome, Firefox, or Safari.
  • Finding security holes and patching them.
  • Deployment scripts and emergency checklists.
  • Improving server monitoring tools.

What you work on will depend in large part on what you want to learn.

Over the course of the summer, you'll have a chance to get intimately familiar with the different components of a modern web application (hardware, operating system, database, network, application, and cache) and how they fail to work together.

You'll have the satisfaction of building things that benefit real people.

Ideally, you'll gain useful experience and earn a small amount of money while not completely destroying my livelihood.

Requirements

I don't care where you live, but you must be at least eighteen.

You must be highly autonomous and good at muddling through problems. If you are easily frustrated, you will not enjoy working on Pinboard.

You should know your way around a Linux system and be proficient in at least one programming language.

You have to be somewhere where I can meet with you in person before June 15. This means Paris, Warsaw, Gdańsk, or the San Francisco Bay Area.

You should have a strong work ethic and lots of enthusiasm. Enough for two people.

How To Apply

Send me a link to a webpage with the following info:

  • Who you are, and where you live.

  • What you're good at already.

  • What you'd like to become good at this summer.

  • A feature that you'd like to see added to Pinboard.

  • Any project you've worked on that you're particularly proud of.

  • For super extra credit, your solution to problem #17 in the Matasano crypto challenges.

The webpage should be served over TLS and include a custom HTTP header called 'Pinternship', with its value set to an emoticon of your choice.
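For anyone unsure what the custom header requirement looks like in practice, here is a minimal sketch using only the Python standard library. It is one possible way to do it, not a required approach: the page body, the choice of emoticon, the `make_server` helper, and the certificate file names are all assumptions made for illustration. Any web server that can add a response header and terminate TLS would do just as well.

```python
import ssl
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: a plain ASCII emoticon, since HTTP header values
# are safest when kept to ASCII.
EMOTICON = ":-)"

# Placeholder body; the real page would carry the application details.
PAGE = b"<html><body><h1>Pinternship application</h1></body></html>"


class ApplicationHandler(BaseHTTPRequestHandler):
    """Serves one static page and attaches the custom header to it."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        # The custom header named in the application instructions.
        self.send_header("Pinternship", EMOTICON)
        self.end_headers()
        self.wfile.write(PAGE)


def make_server(host="127.0.0.1", port=0, certfile=None, keyfile=None):
    """Hypothetical helper: builds the server, wrapping it in TLS
    when a certificate and key are supplied."""
    server = HTTPServer((host, port), ApplicationHandler)
    if certfile:
        context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
        server.socket = context.wrap_socket(server.socket, server_side=True)
    return server
```

Calling `make_server(port=8443, certfile="cert.pem", keyfile="key.pem").serve_forever()` would serve the page over TLS, assuming those two files hold a valid certificate and private key. Leaving out `certfile` gives a plain HTTP server, which is handy for checking the header locally before sorting out certificates.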

If I really like your application, I will ask you to give me a couple of personal references. I'll also ask you to meet me for an in-person interview.

Thank you!

—maciej on April 28, 2013




Pinboard is a bookmarking site and personal archive with an emphasis on speed over socializing.

This is the Pinboard developer blog, where I announce features and share news.




How To Reach Help

Send bug reports to bugs@pinboard.in

Talk to me on Twitter

Post to the discussion group at pinboard-dev

Or find me on IRC: #pinboard at freenode.net