I recently needed to parse 8GiB+ .fastq text files quickly to calculate a simple statistic of genomic data. My initial “pfastqcount” implementation in Ruby worked fine, but with many files to process it took longer than I had hoped and consumed an alarming amount of CPU. I ended up reimplementing the pfastqcount command-line program in C; it takes one or more .fastq files, memory maps them, and computes the statistic. Simply dropping my algorithm down to raw C significantly sped up the process and reduced CPU usage, as you would expect when moving away from an interpreted language. If any of you bioinformaticians find the need to implement a FASTQ data processing algorithm in C, I encourage you to fork the project and use it as a template. The project is Apache 2.0 licensed for your convenience and publicly available on GitHub.
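For a rough idea of what the memory-mapping approach looks like, here is a minimal sketch. It is not the actual pfastqcount source. It assumes a POSIX system and strict four-line FASTQ records, and because the post does not say which statistic is computed, it simply counts records and total bases. The `fastq_stats` type and both function names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical summary type: the real statistic is not specified. */
typedef struct {
    unsigned long long records; /* number of sequence lines seen */
    unsigned long long bases;   /* total characters on those lines */
} fastq_stats;

/* Scan an in-memory FASTQ buffer, assuming strict 4-line records:
 * header, sequence, '+' separator, quality. */
static void fastq_scan(const char *data, size_t len, fastq_stats *out)
{
    size_t line = 0;     /* zero-based line index */
    size_t line_len = 0; /* characters on the current line */

    assert(out != NULL);
    out->records = 0;
    out->bases = 0;

    for (size_t i = 0; i < len; i++) {
        if (data[i] == '\n') {
            if (line % 4 == 1) { /* second line of each record */
                out->records++;
                out->bases += line_len;
            }
            line++;
            line_len = 0;
        } else if (data[i] != '\r') { /* ignore CR from CRLF files */
            line_len++;
        }
    }
    /* A truncated file may end on a sequence line with no newline. */
    if (line_len > 0 && line % 4 == 1) {
        out->records++;
        out->bases += line_len;
    }
}

/* Memory map a file read-only and scan it. Returns 0 on success, -1 on error. */
static int fastq_stats_from_file(const char *path, fastq_stats *out)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) { /* mmap rejects zero-length mappings */
        out->records = 0;
        out->bases = 0;
        close(fd);
        return 0;
    }
    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;
    fastq_scan((const char *)map, (size_t)st.st_size, out);
    munmap(map, (size_t)st.st_size);
    return 0;
}
```

A command-line wrapper would loop over its arguments, call `fastq_stats_from_file` on each path, and print the totals. Mapping the file lets the kernel page data in on demand instead of copying it through a read buffer, which is one plausible reason this style suits very large files.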
Author: preston.lee
-
Hacking Your Own Genome
Many thanks to everyone that participated in my Hacking Your Own Genome session at today’s Desert Code Camp 2011.2 event in Chandler, Arizona! I’m very passionate about the topic area, and hope the session was both entertaining and useful. Here are the presentation materials, source code for the “youandme” application, and a few other links you might find useful. Thanks again for the opportunity and don’t hesitate to reach out and stay connected!
Presentation:
- “Hacking Your Own Genome” slide deck. View online, or download in multiple formats.
- “youandme” demo project source code and instructions for starting to work with 23andme raw data sets.
Additional Links:
- 23andme genetic testing services.
- Promethease SNP profile data mining application.
Have fun!
-
Amazon Finally Adds Lending Library
Late is better than never! (From the press release. Customer details here.)
With an Amazon Prime membership, Kindle owners can now choose from thousands of books to borrow for free - including over 100 current and former New York Times Bestsellers - as frequently as a book a month, with no due dates. Books can be borrowed and read on all Kindle E Ink devices and Kindle Fire.
Not exactly the most giving of terms, but it’s a start. Thanks to the Kindle team!
-
10 Universal Weight Loss Tips For Men
I’ve spent the majority of my adult life in the “obese” clinical strata, and have only in recent years taken to caring about maintaining a reasonable weight. I am an American male, 5’10”, with Korean-Caucasian ancestry and have ranged between 149 and 172 pounds in the past year. My body’s individual natural comfort zone seems to be in the 150-155 range. At my heaviest I capped out in the 200-210 range. In other words, my physical stature and default dietary habits are spectacularly unspectacular for an American, and I consider myself fairly representative of the “average American male”. I lost most of those excess pounds (180+) in a fairly short amount of time. Everyone is unique in many ways, but from my own research and personal experimentation I believe these points to be largely universal for adult men.
Weight And Health
Weight loss does not necessarily correlate to health gain. It’s possible to lose weight on a diet of Twinkies, but you would be seriously lacking in dietary components despite being lighter, and most likely put yourself at higher risk of heart disease and diabetes. Assuming that part of your motivation for weight loss comes from a desire for better health and longevity, remember to see the forest for the trees. It’s great to look healthy, and better to be healthy.
As a general guideline, stick to eating actual foods. (Edible substances like high fructose corn syrup should not be considered “food”.) If you couldn’t produce the ingredient if you really wanted to, you probably don’t want to eat it. You have tons of zero-calorie sugar replacements–Splenda, Nutrasweet etc.–but these are not magic bullets and generally should be avoided as “crutch” substances. See Michael Pollan’s excellent Food Rules for guidelines.
10 Tips
- Weigh yourself daily at a consistent time with no excuses. It’s especially important to continue weighing yourself when you’re struggling, both to hold yourself accountable and to prevent prolonged lapses of judgement.
- Treat weight management as a lifestyle, not a program. Programs are things you do for a short period of time before going back to the status quo. Lifestyle changes are long-term investments made for the benefit of yourself and those you love.
- Drink water and tea when you are thirsty. Have other tasty beverages for enjoyment, not to quench thirst. Beer and other alcoholic drinks are unfortunately high in calories, as are many sodas and even fruit juices. Water first.
- Shop when you’re full. Plowing through the aisles on an omg-I-have-nothing-to-eat rampage is going to result in a cabinet full of snacks. Your body evolved to crave certain foods to compensate for natural rarity. When you’re hungry, reason goes out the door, and satisfying cravings for those foods that are now readily available becomes the easiest fix.
- Visit only upper-tier merchants such as Whole Foods and Trader Joe’s when at all possible. In addition to higher quality foods, they do a much better job than conventional grocery stores of not barraging you with excess junk. Fruits and vegetables are also of notably higher quality and tastiness.
- Maintain the lifestyle because “nothing tastes as good as fit feels”, not to punish or deprive yourself.
- Talk about solutions with others doing the same. Being around others taking action is extremely encouraging and motivating. Keep in mind the exact opposite also applies.
- Focus more on diet than exercise. Both are necessary, but you’d be better served with a good diet and only 30 minutes of exercise per week than a horrible diet and 4 hours of exercise per week. Many weight loss systems prescribe disciplined physical regimens, but remember that diet matters more.
- Weight train for weight loss. Additional muscle mass allows you to burn calories faster, even when you’re not exercising. Cardiovascular exercise is great for your heart and blood pressure, but doesn’t build the calorie-burning, protein-consuming muscle like weight training does. Also remember that you cannot control where you lose weight: only where you build muscle. No one is going to see your rock hard abdominal muscles if your mouth can’t trade in the cheese sandwiches.
- Know when to break the rules. If you use a formal system such as Paleo or Atkins you may have strict guidelines. At some point, however, most foods are going to be ok in moderation so long as you can control yourself. It’s ok to not be perfect!
Good living and good luck. 🙂
-
Preston And AT&T Give Each Other The Finger
I made the jump the last day before Verizon stopped offering unlimited data plans. The delay in switching to Verizon was not due to lack of motivation–AT&T service has always paled in comparison to Verizon’s in Arizona and is nearly nonexistent at my summer home–but to desperate procrastination in dealing with the migration process. My longest conversation (highly abbreviated) with AT&T on the matter took about an hour and was so traumatizing that I can’t see myself ever returning. As far as I’m concerned AT&T is dead and buried:
Coverage map showing AT&T's miraculous ability to provide "Best" service without operational towers.
Me: I’m not happy with my AT&T service and would like to cancel my service plan.
Customer Service Representative: I’m sorry to hear that, sir. May I ask why?
Me: I’m in an area with about 1-bar service about half the time, no 3G data (EDGE only), and constant dropped calls. I’m not really getting “service” per se.
Rep: I’m very sorry to hear that. We can cancel your plan for $<huge fee>.
Me: Well… I really don’t think that’s entirely fair. The issue isn’t really that I don’t WANT service, but AT&T isn’t providing what I’m already paying for. I’m paying about $100/month for unlimited 3G data, <list of other features>, and I only get a few of them some of the time. Check the coverage map.
Rep: Yes, sir! I can see you live in a “Best Coverage” area. That is very good!
Me: 1-bar signal 50% of the time, no 3G and dropped calls the other 50% is “Best Coverage”?
Rep: The map shows we have multiple towers in the area! You should be getting great service according to the map.
Me: I understand what the map says; I’ve seen it many times, trust me. The issue is not just me, though. No one else with AT&T seems to get usable service here, either.
Rep: I’m very sorry to hear that, sir. One of the towers is not operational. That may have something to do with it. Would you like us to send out an engineer to test your hardware?
Me: Wait… what? First, my hardware is fine. It works fine in <other cities with service>. No one else’s phone works well here on AT&T’s network, either. Second, if you’re dispatching an engineer wouldn’t it make sense to fix the tower instead? …You know, the NOT OPERATIONAL one that is currently providing “Best Coverage”?
Rep: Unfortunately we cannot do that, sir.
Me: It doesn’t make sense to charge me for a service you just admitted you can’t provide. I understand I’m under contract and don’t dispute that, but AT&T has obligations, too, and if AT&T can’t meet them it isn’t right to punish the customer.
Rep: Unfortunately, sir, it is your fault for choosing to live in an area without good service coverage.
Me: ARE YOU FUCKING SERIOUS??? I checked your goddamn map before, during and after moving here, and the fucking thing says “BEST COVERAGE” despite having a non-operational tower. I’ve been here for some time now and it’s never been any better.
Rep: Yes, sir! Coverage in that area is strong. Would you like us to send out an engineer to test your equipment?
Me: YOU JUST SAID A TOWER IS NOT OPERATIONAL. SOME GUY WAVING MY PHONE TO THE SKY IS NOT GOING TO MAKE IT CONNECT TO A TOWER THAT DOESN’T WORK. DO YOU UNDERSTAND WHAT THE FUCK A CELL TOWER DOES?
Rep: Like I said, sir, it is AT&T’s policy to charge cancellation fees according to your contract. We cannot even consider overriding them in such a strong service area.
Me: <Infuriated abrupt disconnect.>
So, I’m now staring at a ridiculous cancelation bill. On the bright side, though, I sold my old AT&T iPhone the next week via eBay for over $200, which not only covered the new Verizon hardware cost but activation fees as well. I’m not getting great 4G on my Mifi (which was disclosed though), but at least I’m getting ok 3G and voice service on my iPhone for about the same price. AT&T? Please.
-
Apple Wireless Bluetooth Keyboard Dvorak Hacking
It’s very easy to switch around the keys on Apple’s current generation of wireless Bluetooth keyboard. The first time doing this hack I only had one “close call”, but now that I know how to get the keys off safely it’s simple, fast and easy. I used a couple razor blades to pop off the keys, but a very small screwdriver would work just as well and be much safer.
The keys are best removed by lifting the key cap from the top left or top right corners. The plastic mechanics beneath the key move analogously to a cherry picker, and you interfere with them less by lifting the top corners of the key.
Once you’re done, the only drawback to your new sleek Dvorak keyboard is the lack of nubs on the U and H keys. Very carefully dab a small drop of superglue on them to address the issue, and enjoy!
(Sorry for the lack of pictures … when I figure out where I put them I’ll update this post. Andy Skelton has some pictures in a similar post.)
-
I Bought A Grill At Walmart
Yes, I know, so you can withhold your scorn in the satisfaction that I’ve been promptly punished by the powers that be.
Upon initial assembly of my new $25 *Deluxe* propane grill, I assumed that the small Chinese boy manning the “‘L’ washer” machine at the grill factory simply hadn’t received adequate training from HR during the sweatshop orientation process. Or maybe he’s a union kid and was on a smoke break. Or perhaps felt like leaving out the sixteenth washer in the knowledge that some American chump would eventually rifle through every inch of packaging looking for the part that never existed.
Approximately 2/3rds through assembly, I came to the realization that the shrink-wrapped parts smell funny: a contagious-like biological odor that I imagine a despair machine must smell like. While thinking about koala bears was much more fun than trying to remember what SARS stands for, I was brought back to sadness when I realized that Mr. NoPaidOvertime Jr. also left out a ‘C’ washer. GREAT. This wasn’t going well and I still had a handful of parts left on the table.
Assembly completed. Wait… Nevermind, back to step six to add the metal thing to the other metal thing.
Assembly completed? Yes! And then, suddenly… EXISTENTIAL CLARITY.
This is, without a doubt, the worst grill I have ever bought, used, fondled and, possibly, ever set my eyes upon. It is not merely a poor device, but in strong contention for *poorest* device. It is as if an investor found an abandoned warehouse of unrelated parts and had to decide between making radiators, fire extinguishers, or grills. The feet that are supposed to fold the damn thing into a tidy ball of grill were bent in six different ways out of the box, and the handle clearly didn’t come out of the sadness-injection machine correctly.
The whole supply chain here is crap. It’s not just the factory, but design, delivery, retail, economics, ethics... Everything about this product and the devil spawn it came from is horrible for America. Is this news? No. But we all need a periodic reminder that the crap we habitually clamor for isn’t doing us any favors.
Please don’t shop at Walmart.
-
MountainWest RubyConf 2011 – Journeta Lightning Talk Slides
The “slides” from my MountainWest RubyConf 2011 lightning talk on Journeta have been added to the source tree, here. Go make a peer-aware text editor or something!
-
Textbooks: What Publishers Don’t Understand About The Internet
The Kindle 3G + Wifi eBook Reader
Textbook publishers in 2011 still aren’t fully appreciating the impact the Internet will have on their industry. A reasonably forward-thinking individual might optimistically assume the industry is self-correcting towards the wants and needs of consumers, but that doesn’t seem to be the case. Let’s explore:
Electronic typesetting.
Physical textbooks obviously can’t be reissued every time a typo is corrected. That’s fine, so we can keep making large textbook changes via en masse “editions” to save typesetting efforts.
But electronic textbooks have many not-so-obvious differences.
1. Screen sizes of reader hardware/software vary dramatically.
2. Even if screen sizes were the same, it is of tremendous value to allow the user to change font and text size.
3. Some screens support color, while others don’t. A wonderful color graphic may appear a blobby mess on a monochrome reader.
4. The concept of a “page” no longer exists, due to #1 and #2, above. Content cannot simply say “See page 32.” References must be dynamic links, instead.
5. Content can (and should be) linkable. Obvious examples are tables of contents and figure references. External links need to be supported, as well as more sophisticated “interactive” embedded content items. (A mathematics textbook with an exercise that asks, “Y = 3X + 2. Calculate Y for the following X values: 0, 4, 5.7.” should also grade the assignment. Why do I need a completely different book for this?)
6. Searching, highlighting, note taking, and content sharing are all critical “must have” features for electronic texts.
7. Open data interchange is probably the biggest techno-political challenge. Retailers aren’t yet jumping on the opportunity to exchange data with the competition. (But they will need to concede because it’s what the consumers and publishers will want.)
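To make the self-grading exercise mentioned in the list above concrete, here is a minimal sketch of what an embedded grader might do for “Y = 3X + 2”. The function names and the tolerance parameter are hypothetical, and a real eBook format would presumably carry this logic in its own scripting or metadata layer rather than in C.

```c
#include <assert.h>
#include <stddef.h>

/* The exercise from the text: Y = 3X + 2. */
static double expected_y(double x)
{
    return 3.0 * x + 2.0;
}

/* Count how many submitted answers match the expected value for the
 * corresponding X, within a small tolerance (hypothetical parameter,
 * needed because values like 5.7 are not exact in binary floating point). */
static size_t grade_answers(const double *xs, const double *answers,
                            size_t n, double tol)
{
    size_t correct = 0;

    assert(xs != NULL && answers != NULL);
    for (size_t i = 0; i < n; i++) {
        double diff = answers[i] - expected_y(xs[i]);
        if (diff < 0.0)
            diff = -diff; /* absolute difference */
        if (diff <= tol)
            correct++;
    }
    return correct;
}
```

For the X values in the exercise (0, 4, 5.7), the expected answers are 2, 14, and 19.1, so a student who submits those three values would be scored 3 out of 3.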
Adobe’s portable document format is a dying beast, as is Adobe itself. PDFs just don’t work well for textbooks. For all these reasons, please stop calling your PDF renderings “eBooks” and then calling it a day. PDF documents cannot “reflow” the way a web page does, and make reading extremely awkward because of reasons #1 and #2, above. In short, direct PDF conversions–such as those used by the University of Phoenix–don’t have any of the typesetting considerations or functional niceties of modern electronic book formats, and should be avoided. Schools need to stop accepting cheap “Print To PDF”-style textbooks, as well as “eBooks” that can only be read through a web browser using special software that doesn’t support any of the above features. If your eBook implementation is less powerful than a physical book, you’re doing it wrong. Please improve!
Separation of form and content.
Typesetting concerns do not mean all is lost. If anything, it’s a wonderful opportunity to make revolutionary steps in improving the way written knowledge is transferred. As we’ve learned from the web, it’s entirely possible to design for dynamic layouts provided you can set at least a few constraints.
Physical textbook typesetting needs to be optimized for a specific target. Electronic typesetting needs to optimize for overall good layout within a range of constraints. Web applications can generate multiple document types for the same content, and with such nimble requirements for electronic media, we can do the same with updated forms of typesetting languages like LaTeX.
eBooks don’t require a local sales representative.
It’s nice, I suppose, to have a rep on call to overnight you a textbook on a moment’s notice, but that’s not necessary when I can click a button on my iPad. The issue here is misaligned incentives in the payment of distributors.
Doesn’t get where this whole Internet thing is headed.
To use a real-world example, my local Pearson rep seems to earn commissions on physical textbook sales to my classes, but not electronic copies sold through Pearson affiliate (or subsidiary?) CourseSmart. She’s always happy to help when I’m interested in buying paper, but suddenly goes unresponsive when I have a tangential question about an electronic book.
It’s not her job to help with online sales. That’s an entirely different business unit or whatever, so who cares about that, right? Here are some great properties of CourseSmart, Pearson’s chosen eBook sales system:
- You can only access your electronic textbook for about 6 months. That’s right, you don’t own it. You’re essentially renting it for the semester.
- The pricing is pretty high, especially considering you can often sell back physical books after the semester. You always get $0 after the rental period. Savings? Please.
- You can’t really do anything neat with the electronic version, like download a simple effing PDF, even if you’re a legitimate, verified instructor that can already download content such as instructor solutions manuals and slides. (They don’t trust us. Trust me on that.)
- Pearson and college sales/support infrastructure and personal incentives aren’t (yet) set up to fluidly handle electronic texts.
The Pragmatic Programmers offer DRM-free, reasonably priced technical books. Check ’em out!
In short, CourseSmart sucks. I thought it was going to be cheaper, simpler and generally better for students to use the electronic versions, but given the high-cost, “rent”-like nature and lack of features, it’s not great. Personally I’m looking to switch to publishers that understand ebook-oriented use cases and build their product to fully take advantage of the Internet, rather than just go through the motions. PragProg is a great example of a technical publisher that’s moving us in the right direction. (I send them a lot of business and highly recommend you check them out, too!)
I have to believe that the profit margins on selling an 800-page textbook as a $60 “online view only for 6 months only” product are greater than those on a $100 hunk of tree, especially considering the expenses of transporting, retailing, and commissioning (or marking up) every step. I suppose many of those people don’t want to go electronic due to fear of job loss, even though the jobs may simply change, instead.
Fast release cycles.
With properly designed exchange formats, textbooks and metadata can be pushed and pulled between publisher, retailer and consumer in under a second. The concept of “this year’s edition” starts to lose meaning if the publisher can fix a typo and push out a new revision with no more effort than updating a wiki page. This poses serious technological challenges with ISBNs, Library of Congress records etc., but all these things are fixable, and none of the solutions have anything to do with building a new PDF that gets emailed to me. (Even Amazon doesn’t do this right yet, even with their .azw format. When you agree to receive an optional update of a book you’ve purchased from Amazon, you lose all your notes and highlights from the original version. Lame.)
We need to embrace this idea of rapid content change, rather than cling to the idea of annual product releases. We can do it. Really.
Closing thoughts.
All the players in the textbook industry have different incentive systems, but all have much to gain. Rather than using the friendly neighborhood college bookstore as a primary retail outlet, the supply chain process… no, the entire industry, needs a comprehensive dose of cold water to the face. All is not lost, but in 2011? They still don’t get it.