The Worsening Fragmentation of the eBook Market

On my way to the office this morning my bag seemed especially heavy, the natural effect of stuffing an attaché with a MacBook Pro, iPad, iPhone and Kindle. I felt silly feeling it necessary to keep all these electronic gizmos simultaneously latched onto my shoulder within seconds' reach of my left hand, each ready to perform some specific task that required firing up its individual display and taking a few milliamp hours off its individual lithium-ion battery pack.

Each of these devices is especially good at performing certain types of tasks, to the point that it also feels silly to not use the tool best suited to the job. To a computer scientist all four of these are technically Turing machines–more commonly known as “computers”–but each has its own practical strengths and weaknesses. And while carrying any single device by itself makes one incredibly mobile, carrying all four does not. I’m like a sleep-deprived mother of quadruplets sluggishly pushing a custom designed stroller through the grocery store. The monstrosity of brushed metal widgets, cables and wall warts I’m toting reminds me of that fictional car designed by Homer Simpson.

But such are the pros and cons of appliance computing. Not all of these hardware devices are technically needed on this particular Wednesday, but the combination of specialized functions provided by the union allows me a more productive day. I could have left at least one at home, provided that I had a reasonable amount of interoperability between them to shuffle data.

Stop. Oh god. I saw this coming the second Amazon announced they would use their own locked-down format (.azw/.mobi) for eBooks purchased through their store. (Aside: If you’re interested in Kindle encryption you may eventually find yourself at my KindleTools site for finding PIDs.) My biggest fear regarding the emerging ebook market is now in full swing. Not only are there subtle, often incompatible (and proprietary) differences in ebook data between reading applications, but most of the time I can’t even legally attempt to move that data between them. It’s like Microsoft Office vs. WordPerfect vs. Lotus Notes vs. The People of Earth all over again.

Each content retailer is trying to be the de facto digital ebook data locker for the entire market, and the folks at the top of the food chain–most notably Amazon–have no business interest in supporting standardized (or at least conventionalized) data interchange with less popular consumer applications and devices. But why would they? If they can provide the content and the software and the hardware with a majority of the market, why not do everything possible to lock consumers into the monopoly? Here’s a painstakingly detailed scientific visualization of the current eBook market:

Amazon's view of the eBook market.

Let me make this clear: I am no stranger to paying for books. I read a LOT, and especially over the past year it hasn’t been unheard of for me to spend well over a hundred dollars per month on eBook content alone, which I do for many reasons. Here’s the 8th-grade equation that scientifically demonstrates the value of this technology in my life:

[Knowledge Gained (in the fictional unit of “knols”, K) x Ease of Future Reference (in the subjective economic unit of utils, u)] / [Content Cost (in dollars, $) x Total Consumption Time (in hours, 3600 x s)]

This new unit of electronic book value that I’ll refer to as a Vebu–short for “value of ebookS unit”–reduces to this:

Vebu == knol utils per 3600 dollar seconds == uK/3600$s

In other words, we need to maximize the availability of meaningful information (knol utils) at a minimum of money and time (dollar hours) to achieve maximum value for our electronic virtual book library, Vebu. A simple, unsophisticated yet meaningful quantity.
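To make the arithmetic concrete, here's a rough Ruby sketch of the formula above. The method name `vebu` and every sample number are made up purely for illustration; none of them are measurements.

```ruby
# Hypothetical helper: Vebu = (knols * utils) / (dollars * hours).
# All inputs below are illustrative, not measured values.
def vebu(knols:, utils:, dollars:, hours:)
  (knols * utils).to_f / (dollars * hours)
end

# A book that is easy to reference later and quick to get through.
baseline = vebu(knols: 10, utils: 2, dollars: 5, hours: 4)

# The same book when lookups are harder (lower u) and everything
# takes longer (more hours).
fragmented = vebu(knols: 10, utils: 1, dollars: 5, hours: 8)

puts "baseline:   #{baseline}"
puts "fragmented: #{fragmented}"
```

Anything that lowers the numerator or raises the denominator drags the result down, which is the pattern behind every "Effect:" note in the list that follows.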

But here’s how this effed up market affects Vebu:

  • I have no fewer than 7 different, largely incompatible pieces of eBook reader software on iPad alone, as of today. Kindle, iBooks, Borders, B&N, Stanza, Free Books and Wattpad. (Effect: lower u, lowering Vebu.)
  • Borders, Barnes & Noble and the other brick-and-mortar vendors are freaking out, scared they’ll become the next Blockbuster of the Netflix era. Each has their own application that works primarily with their own store, but not much else, forcing you to use their reader. Not all software is available on all platforms, though, sometimes making lookups a major pain, and different retailers of course carry different publishers, so it’s easy to unwittingly get sucked into all of them. (Effect: lower u and higher s, significantly lowering Vebu.)
  • None of the distributor reader apps are keen on “sharing” your content with friends/colleagues, forcing others to re-purchase content you should have been able to at least “lend” to them in the freakin’ first place. (Effect: higher $, lowering Vebu.)
  • O’Reilly, PragProg and other publishers don’t think the major distributors should be necessary, and some are leading the charge by allowing you to directly purchase digital editions in a variety of formats. This is fine–I have no major qualms about this–but since most reader applications are trying to push you to the store of the vendor that wrote the app, importing data can be a headache. (Effect: higher s, lowering Vebu.)
  • Amazon, already having a huge content delivery infrastructure, offers proprietary features such as cross-device synchronization of bookmarks and highlights that aren’t as good elsewhere. The Kindle hardware will also read to you in the car, but it only syncs with Amazon services; Apple’s iBooks/iTunes is better with PDFs but doesn’t have text-to-speech; Stanza aggregates many different content sources but isn’t as great with commercial stuff… everything has distinct pros and cons. They’re all different and I have to use all of them because they can’t/won’t talk to each other and I can never remember to which damn content locker I committed my stuff. (Effect: lower u, higher s, significantly lowering Vebu.)
  • The problem grows exponentially greater as more retailers, publishers, application developers, and independent authors enter the market, intentionally building walls that consumers have no interest in observing.

Here’s what needs to happen.

If you’re Amazon, Borders, B&N, or really any retailer that is gung-ho about becoming the provider of individual data lockers, that’s fine, but you need to give us the key. It’s understandable that you’re reluctant to open up your formats in a way that could be consumed in ways you can’t control, but consider this: if you never figure out how to allow publisher content to cross application and retailer boundaries, you are effectively capping Vebu at artificially low levels. If you instead focus on optimizing all the variables instead of restraining them, you’ll have a platform unmatched even by Amazon. I, for one, would switch to it in a heartbeat.

Ruby: On The Perl Origins Of “&&” And “||” Versus “and” And “or”.


Avdi Grimm has a recent post noting the use of the “and” and “or” Ruby keywords as essentially control flow operators, and hinting at their Perl origins. This instantly recalled several mental notes of old Perl programs, so I thought I’d put out a few quick notes on the Perl equivalents for Ruby programmers not versed in Perl 5.

Perl Semantics

Perl actually borrows the precedence rules of “&&” and “||” from C, though I’m not entirely convinced the Perl and Ruby semantics of “and” and “or” are identical. We have to remember that Larry Wall is a linguist, and that some of Perl’s idiosyncrasies are due more to human considerations than machine. Programming Perl has several pages of great content on “and” and “or”. For example, here’s an excerpt from the circa 2000 version of Programming Perl:

But you can’t just up and replace instances of || with or. Suppose you change this:

$xyz = $x || $y || $z;

to this:

$xyz = $x or $y or $z; # WRONG

That wouldn’t do the same thing at all! The precedence of the assignment is higher than “or” but lower than ||, so it would always assign $x to $xyz, and then do the ors. To get the same effect as ||, you’d have to write:

$xyz = ($x or $y or $z);

The moral of the story is that you must still learn precedence (or use parentheses) no matter which variety of logical operator you use.
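The same trap exists in Ruby, where `or` also binds more loosely than `=`. Here's a minimal sketch of the three variants from the excerpt; the method names are mine, for illustration only.

```ruby
# `||` binds tighter than `=`, so the whole chain is evaluated first
# and its result is what gets assigned.
def pick_with_pipes(x, y, z)
  xyz = x || y || z
  xyz
end

# `or` binds looser than `=`, so this parses as `(xyz = x) or y or z`:
# xyz always receives x, whatever y and z are.
def pick_with_or(x, y, z)
  xyz = x or y or z
  xyz
end

# Parentheses restore the `||` behavior.
def pick_with_parens(x, y, z)
  xyz = (x or y or z)
  xyz
end

p pick_with_pipes(nil, false, 3)
p pick_with_or(nil, false, 3)
p pick_with_parens(nil, false, 3)
```

Given `nil, false, 3`, the first and third methods return 3 while the second returns nil, because only `x` ever reached the assignment.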

What Avdi says about precedence and control flow seems correct, though the reason for it–in Perl at least–is to differentiate between two subtly distinct human linguistic semantics. For example, consider the following two English statements.

  1. I’m either traveling by car or just staying home.
  2. I’m either traveling by car or by train.

At first glance the semantics seem identical, but on closer inspection they are completely different. Let’s look at each.

I’m either traveling by car or just staying home.

In Perl, this is the equivalent of:

$traveling_by_car = can_use_car($me) or stay_home($me);

The semantics of or is to describe consequence (aka control flow) in a specific order. Either I can_use_car(), or else screw the whole thing and I’ll just stay_home() instead, in which case it makes little difference what the value of $traveling_by_car is. In other words, the semantics of the assignment trump that of the consequence, and should the consequence occur (which in this case probably has side effects), the assignment probably doesn’t matter. This is why Perl users often use or in the following context instead of ||:

open FOO, $file or die "a horrible death: $!";
@lines = <FOO> or die "$file is empty?";
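For Ruby readers, the analogous idiom pairs an assignment with `or raise`. This is only a sketch: `first_line` is a hypothetical helper of my own, not something taken from Avdi's post.

```ruby
# Hypothetical helper showing `or` used as control flow.
# Parsed as `(line = lines.first) or raise(...)`: the assignment happens
# first, and the raise only fires when the assigned value is nil or false.
def first_line(lines)
  line = lines.first or raise ArgumentError, "input is empty?"
  line
end

puts first_line(["hello", "world"])

begin
  first_line([])
rescue ArgumentError => e
  puts "failed: #{e.message}"
end
```

As with the Perl version, the right-hand side matters only for its side effect; once it fires, nobody cares what ended up in `line`.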

I’m either traveling by car or by train.

In English, this is a misleading statement of truth for several reasons.

  1. What we mean is that one of the two options is true, but one and only one. As programmers we know this better as an xor statement–A is true if and only if B is false, and vice versa–but since there is no colloquial English equivalent of xor, we instinctively infer the meaning from the speaker’s misstatement.
  2. As programmers we tend to look at || as a short-circuit operator evaluated left to right. But this English statement defines no explicit evaluation order. This could have been written “I’m either traveling by train or by car” (notice “car” and “train” are reversed), but it means the same thing. In other words, we are evaluating the arguments for truth, not consequence. Thus, order should not affect truth.
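Both points can be sketched in Ruby. The method name `exactly_one?` is hypothetical, and the example assumes we only care about truthiness, not about which value an expression returns.

```ruby
# Exactly one of the two may be true. `!!` coerces each argument to
# true/false first so that `^` acts as a logical xor.
def exactly_one?(by_car, by_train)
  !!by_car ^ !!by_train
end

# Order of the arguments makes no difference to the truth of the statement.
p exactly_one?(true, false)
p exactly_one?(false, true)

# Claiming both (or neither) violates the "one and only one" reading.
p exactly_one?(true, true)
p exactly_one?(false, false)
```

Unlike `||`, nothing here short-circuits: both sides are always evaluated, which matches the idea that the English sentence is about truth rather than consequence.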

Closing Thoughts

In most real-world Ruby contexts these operators will work interchangeably so long as precedence order is considered. But even so, keep in mind that mixing the two styles in different contexts is not necessarily a sign of inconsistency. Using the appropriate operator may actually be a sign of maturity: a way of communicating slightly different semantics in an otherwise logically equivalent context.

The $1K CD/DVD/LightScribe Replicator: The DIY Guide To Manufacturing Your Own Discs For Less Than $1 Each

This do-it-yourself replicator features eight Lite-On CD/DVD burners. By flipping the disc over you can burn images onto the top using the drive lasers.

I’ve slowly updated components of The $1K Home Studio over the last few years, but have never had a low-cost, DIY solution for disc replication. After playing with external CD burners and evaluating various proprietary hardware options such as the Aleratec auto-flip burner and MicroBoard tower replicators, amongst many others, I decided that the current commercial solutions are nice, but most definitely overpriced. So I decided to develop my own solution. This custom-built behemoth is built from common off-the-shelf (COTS) hardware from Fry’s Electronics and inexpensive commercial software. It costs less to own than commercially branded replicators, and also functions as a normal desktop computer since it runs Windows 7 and Linux. (I took care to also buy a Gigabyte-brand motherboard that supposedly supports the OSx86 (“hackintosh”) project, but have had little success with the installation.)


  • Intel i5 750 64-bit CPU. (Features 4 cores.)
  • 4GB RAM.
  • 8 x (yes, eight) Lite-On CD/DVD 5.25″ SATA burner drives.
  • Gigabyte motherboard with lots of SATA ports.
  • Add-on SATA card. (Most motherboards won’t have enough connectors, especially if you have 8 x burners plus 4 x hard drives. 🙂 )
  • Big-ass power supply. (The first one I bought wouldn’t even boot the thing. I put in a monster and everything started working.)


The point of all these burners is to burn simultaneously to all of them, but Windows 7 and OS X cannot do this out of the box. Only a small subset of CD/DVD burning software on the market supports parallel burning, and some only seem to support multiple burners for specific types of burns. What’s worked best for me so far is…

  • Nero Multimedia Suite 10 for concurrent audio and data burning with multiple burners. You don’t have a lot of easy-to-use alternatives here, and I’ve also noticed a few glitches with Nero. Keep your eye out for sales here and you can pick up a copy dirt cheap.
  • Acoustica CD/DVD Label Maker for concurrent LightScribe replication across multiple burners. Again, not a lot of options here. The free software from does not support multiple burners, though some vendor-specific bundles seem to. (LaCie’s LightScribe software in particular appears to support simultaneous LightScribe burns, and they also have a Mac version. I would have gone with a Mac-based solution, but 8 x USB 2.0 drives probably would not work so well.)
CDs burned with LightScribe technology. Discs come in many different colors.

I decided to create all my replicated discs using LightScribe technology. This allows me to flip LightScribe CD-Rs upside-down in the burner and use the laser to burn custom graphics onto the top of the disc. I also made the command decision to use COTS CD sleeves instead of CD jewel cases or slimline cases. The plastic ones are more expensive, always crack, and are pretty much useless from the start since most people seem to rip their CDs nowadays anyway. Sleeves protect the disc, come in many colors, are far less expensive, even cheaper in bulk, and perhaps best of all can be printed on directly through ordinary laser and inkjet printers.

The system runs Windows and Ubuntu. Additional drives are interchanged using hot-swap SATA drive modules.

System Pros

  • Inexpensive initial fixed cost of hardware parts and software licenses.
  • Inexpensive variable cost per disc since LightScribe labeling uses the drive laser instead of ink. There are no costly consumables to replace. (Ordinary LightScribe media purchased in bulk works great.)
  • Quick data, audio and LightScribe replication using 8 concurrent burners.
  • Doable by anyone capable of building a PC, given a little time.
  • Functions beautifully as a normal desktop computer.

System Cons

  • Not completely automated like some commercial units because disc loading, unloading and flipping (if using LightScribe) is a manual process.
  • Still uses CD-Rs. These are not the same as commercially pressed mass media discs, but a lot cheaper.
  • (This one is only applicable to audio.) I’ve yet to find inexpensive parallel burning software that can handle DDP images. (The standard in “Red Book” audio CD mastering.)
  • Since LightScribe labeling uses the drive laser instead of ink, disc labels are grayscale only. (Note: You have a lot of options in disc color, though, so it’s not a big deal. Just use your creativity.)

Replication Process Overview

Label four empty CD pancakes to manage the assembly line replication process. If you don't you'll get your disc piles confused!

My primary purpose for this buildout is to replicate audio CDs as quickly as possible for Sonic Binge Records: the awesome music production company. In particular, I need to quickly replicate a pancake’s worth (usually 25-50) of audio CDs as inexpensively as possible. After much trial and error with the process, this is what I’ve found works best.

  1. Create final CD master image. (For me that’s using WaveBurner on a Mac. For replication purposes it doesn’t really matter as long as the master is good.)
  2. Take four empty CD pancake containers and label them “Blank”, “Burned”, “Labeled”, and “Ready” to create an assembly line process. You can of course save these for future jobs.
  3. Use Nero Burning ROM to replicate batches of 8 at a time. When they’re done, be sure to put them in the “Burned” stack so you don’t get burned discs confused with “Blank” discs.
  4. While they’re burning, create a square grayscale graphic for LightScribe burning. (Free label creator software is available, though anything like Photoshop works too. I usually use a combination of Photoshop and Acoustica.)
  5. Use Acoustica to label batches of 8 at a time. Each batch will take a while. Full-disc burns seem to take around 30 minutes per batch: much longer than the data/audio side of a standard CD-R. Move discs to the “Ready” pile when they’re done. (Note: The “Labeled” pile is for discs that have been LightScribe labeled but not burned with data or audio. You can end up in this situation when using multiple computers to do burning.)
  6. While they’re burning, use your favorite document application to design your printed CD sleeves. I’ve started buying color variety packs in bulk packs of 300 to keep options high and costs down.
  7. Bulk print the entire order of sleeves in a single run. As long as you can set the size of the feeder tray, your existing feeder should work fine. (CAUTION: remember that the “window” is made of plastic, and can melt if exposed to heat. Think twice before trying your laser printer. 🙂 )
  8. Take discs from your “Ready” pile (as they finish getting labeled) and slip them into sleeves to create the final product, suitable for general distribution. The laser-etched imaging adds a great, distinctive touch, and of course you can get as creative as you want with the sleeves, too.
  9. Done! (aka beer time.)
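For planning a run, the batch arithmetic in the steps above can be sketched in a few lines of Ruby. The 8 burners and the roughly 30 minutes per LightScribe batch come from this post; the method and constant names are hypothetical, and the estimate ignores disc-swapping time and the (much shorter) audio burn.

```ruby
BURNERS = 8                   # drives in this build
LABEL_MINUTES_PER_BATCH = 30  # rough full-disc LightScribe time per batch

# Number of load/unload cycles needed to get through a stack of discs.
def batches_needed(discs, burners = BURNERS)
  (discs.to_f / burners).ceil
end

# Rough labeling time only; a lower bound on the whole job.
def labeling_minutes(discs)
  batches_needed(discs) * LABEL_MINUTES_PER_BATCH
end

[25, 50].each do |discs|
  puts "#{discs} discs: #{batches_needed(discs)} batches, " \
       "~#{labeling_minutes(discs)} min of labeling"
end
```

A 50-disc pancake works out to 7 batches, so labeling is the bottleneck and is worth overlapping with sleeve design and printing as described in steps 6 and 7.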


  • Fixed: ~$1K for the machine build, with about $400 of that just for the burners. I reused/repurposed parts from old junker machines where I could, and could have saved some money by buying online. I was in a rush and just went to the store.
  • Variable: Roughly $0.40 – $1.00 per disc, depending on the disc quality, packaging, ink etc. you decide to use for each project. (All things considered, the $0.40 version looks pretty decent!)
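If you want to fold the machine itself into the per-disc price, here's a rough Ruby sketch. It uses the ~$1K fixed cost and the $0.40 to $1.00 variable range from the list above; the method name and the disc volumes are arbitrary examples, not actual production numbers.

```ruby
# Hypothetical helper: spread the one-time build cost over every disc made,
# then add the per-disc cost of media and packaging.
def cost_per_disc(fixed_cost, variable_cost, discs)
  fixed_cost.to_f / discs + variable_cost
end

[100, 1_000, 10_000].each do |discs|
  low  = cost_per_disc(1_000, 0.40, discs)
  high = cost_per_disc(1_000, 1.00, discs)
  puts format("%6d discs: $%.2f - $%.2f each", discs, low, high)
end
```

The fixed cost dominates on small runs and fades as volume grows, so the amortized figure approaches the variable cost the more discs the machine produces.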

Closing Thoughts

If you’re a musician without computer skills I would not recommend attempting this project, but if you feel fairly comfortable putting together machines, it’s honestly not that hard. It’s just a PC, after all. (Disclaimer: I do have a degree in Computer Science and Engineering, so my perspective of “not that hard” may be a bit skewed.)

I hope you’ve found this rough how-to guide both inspirational and informative. It’s very useful to have a replication machine handy, and if you’re actively working with people on projects intended for distribution it’s a great investment!

Please use this comments section for all your general comments and questions and I’d be happy to address them. Thanks for reading!

Major Seagate/Maxtor Fail

It’s Friday, 10pm, and I’m not a happy camper. This picture is me holding a pile of ordinary hard drives I keep on my home desk. They are cycling backup drives and are not in any way frequently used. Four are Seagate Barracudas–one of which I’ve already had replaced–and the fifth a Maxtor DiamondMax. The oldest of the bunch appears to be from 2002 and all are PATA 200-250GB models.

I’m unhappy because I picked them up tonight to run a very infrequent backup of all my household data: over a TiB worth, which requires all of the drives for a complete home backup. Much to my dismay, I won’t be running any backups this weekend.

Failure rate: 100%. (5 out of 5 failures.)

I haven’t been this unhappy with a manufacturer since the last of my IBM DeathStars failed around 2003. Fortunately all the Seagate models are still under warranty, but such performance is still disheartening and frustrating.

What’s happened to quality drive manufacturing in the 21st century? Some of the ~10MB hard drives in my 486-era machines easily lasted 10+ years, but a single drive these days lasting over 3 years seems ever more scarce. Sigh.