August 7, 2010

BBC News, pt III

There’s been a hiatus in our efforts to rewrite the BBC News app in HTML5 – no thanks to the odd tropical storm, a consuming day-job project, and kids who’ve decided that boomerang toucans take priority over their father’s Javascript hacking. So let’s get back to it.

We left things at the point where we had a static HTML skeleton that sort of looked like it could be the BBC News app if you squinted at it. Now it’s time to make it work.

To start with, we need to get our HTML file to be able to make calls to the BBC servers and retrieve JSON- and XML-based payloads. Immediately we run into a problem here, because browsers won’t make requests to different domains (or will, but probably won’t receive a successful response). I can’t upload my humble little HTML file onto the BBC servers, and of course I can’t get them to respond with JSONP or play the cross-origin preflight request game.

So I need to do a little server-side proxying of my own. It seems that Safari refuses to set the user-agent header in AJAX calls anyway – and if you recall from our earlier discovery, that’s important for this app. The proxy will be able to do this user-agent rewriting for us too.

Since I’m running whitherapps.com on an nginx server, running such a proxy is incredibly easy. I’ve simply added this to the server’s config file:

location /apps/bbc-news/proxy {
  resolver 8.8.8.8;
  proxy_pass $http_x_bbc_url;
  proxy_set_header User-Agent $http_x_bbc_ua;
}

…which basically means that any request to http://whitherapps.com/apps/bbc-news/proxy will be proxied on to the URL present in the X-BBC-URL header (as resolved by Google’s DNS server), and the user-agent will be rewritten to whatever is in the X-BBC-UA header. (Not using nginx for your own web server? Why not?)

In the Javascript for the app, I’d better provide a small AJAX handler that sets these headers appropriately and calls back to a provided success function (with the response text) or a failure function (with the XHR instance) as required. I also append a timestamp to the end of the URL, since I want to be sure that the browser doesn’t cache it, confused by my burying of the changing URL inside the headers.

var BBC_PROXY = "/apps/bbc-news/proxy";
var BBC_USER_AGENT = "BBC News 1.2.1 (iPad; iPhone OS 3.2; en_US)";
function bbcXhr(url, success, failure) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", BBC_PROXY + '?' + new Date().getTime(), true);
  xhr.onreadystatechange = function() {
    if (xhr.readyState == 4) {
      if (xhr.status == 200) {
        success(xhr.responseText);
      } else {
        failure(xhr);
      }
    }
  }
  xhr.setRequestHeader("X-BBC-URL", url);
  xhr.setRequestHeader("X-BBC-UA", BBC_USER_AGENT);
  xhr.send("");
}

(If I were the BBC and wanted to do this properly, I would simply need to alter this function and/or accept requests from other user-agents, and/or respond to cross-origin requests on these URLs… confusingly, the app needs to access a range of different domains.)
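To make the shape of each proxied request a little more concrete, here is a minimal sketch of what the handler above sends. The helper name buildProxyRequest is made up for illustration and is not part of the app; it just gathers the URL and headers into one object so they are easy to inspect.

```javascript
// Sketch only: buildProxyRequest is a hypothetical helper, not part of the app.
var BBC_PROXY = "/apps/bbc-news/proxy";
var BBC_USER_AGENT = "BBC News 1.2.1 (iPad; iPhone OS 3.2; en_US)";

function buildProxyRequest(targetUrl, now) {
  return {
    // The timestamp is there only to make every request URL unique,
    // so the browser has no cached entry to reuse.
    url: BBC_PROXY + '?' + now,
    headers: {
      "X-BBC-URL": targetUrl,     // where the proxy should really go
      "X-BBC-UA": BBC_USER_AGENT  // what the proxy should claim to be
    }
  };
}

var request = buildProxyRequest(
  "http://www.live.bbc.co.uk/moira/feeds/ipad/news/en/v1",
  new Date().getTime()
);
console.log(request.url, request.headers);
```

The real destination never appears in the request URL, which is exactly why the cache-busting timestamp is needed.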

So, with the app-to-server communication more or less in place, let’s start using it.

If you recall from our original investigation into the app traffic, the first request is to a bootstrap URL that returns some helpful pointers in a JSON file. We need to call that when the app is refreshed (or indeed when it first loads up):

var BBC_BOOTSTRAP = "http://www.live.bbc.co.uk/moira/feeds/ipad/news/en/v1";
function bbcRefresh() {
  bbcXhr(BBC_BOOTSTRAP, function(responseText) {
    var bootstrap = JSON.parse(responseText);
  }, function(xhr) {});
}

addEventListener("load", function() {
  bbcRefresh();
}, false);

While we are at it, let’s create a crude refresh button in the control bar at the top of the app. That should help with testing things:

[<a href='#' id='bbc_refresh'>Refresh</a>]

And in the page load event, add a handler for it:

document.getElementById('bbc_refresh').onclick = function() {
  bbcRefresh();
  return false;
};

Put those in our app file, and… yes, it works: I get the slab of JSON back, and it’s parsed into the bootstrap object by the JSON parser available in Safari – which, although I suppose I do trust the BBC, feels far safer than eval().

Now, there’s a bunch of interesting things in that structure, such as links to privacy URLs and the like, but for now, we’ll just iterate through the list of feeds. These will populate the sections of the UI.

I’ve noticed that the real BBC News app loads the news from the top six or so categories preemptively, but when you click on the other, slightly more obscure categories (like the one for that little-known place, “Europe”), it goes and fetches them on-demand. I presume this is the default=true property set on the major feeds and that the other feeds just don’t work if you’ve never opened them and then go offline.

Rather than taking the feed information and creating the UI elements directly, I’ve decided I’ll do it in two logical steps. First get the feeds and stash the data about them in a local data store, and then secondly construct the UI from the contents of the data store. This is primarily to cater for users who go offline during the life of the app, or who start it up in offline mode and want to see what was there last time.

In fact, now I think about it, the easiest way to do this might be to wrap the local storage functionality in and around the AJAX function. We’ll key data in the browser’s local datastore off the URL. If the device is online, the browser gets fresh data, otherwise it gets it from the local storage.

So here’s a slightly beefier AJAX function that makes this happen behind the scenes:

function bbcCachedXhr(url, success, failure) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", BBC_PROXY + '?' + new Date().getTime(), true);
  xhr.onreadystatechange = function() {
    if (xhr.readyState == 4) {
      if (xhr.status == 200) {
        localStorage.removeItem(url);
        localStorage.setItem(url, xhr.responseText);
        success(xhr.responseText);
      } else {
        var responseText = localStorage.getItem(url);
        if (responseText != null) {
          success(responseText);
        } else {
          failure(xhr);
        }
      }
    }
  }
  xhr.setRequestHeader("X-BBC-URL", url);
  xhr.setRequestHeader("X-BBC-UA", BBC_USER_AGENT);
  xhr.send("");
}

It still calls either a success or failure function, but the former will be called even if the AJAX request is unsuccessful (for whatever reason) but the data was still in the local storage.
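The fallback rule is easier to see with the XHR plumbing stripped away. The following is a sketch under assumptions rather than the app's code: cachedFetch, fetcher and memoryStore are invented names, fetcher stands in for the network call, and the store only needs getItem and setItem, as localStorage provides.

```javascript
// Sketch only: cachedFetch, fetcher and memoryStore are hypothetical names.
// fetcher(url, ok, fail) stands in for the XHR round trip.
function cachedFetch(url, fetcher, store, success, failure) {
  fetcher(url, function (text) {
    store.setItem(url, text);  // fresh data replaces any older copy
    success(text);
  }, function (error) {
    var cached = store.getItem(url);
    if (cached != null) {
      success(cached);         // request failed, but last time's copy exists
    } else {
      failure(error);          // nothing fresh and nothing cached
    }
  });
}

// A tiny in-memory stand-in for localStorage, so the sketch runs anywhere.
function memoryStore() {
  var data = {};
  return {
    getItem: function (key) {
      return Object.prototype.hasOwnProperty.call(data, key) ? data[key] : null;
    },
    setItem: function (key, value) { data[key] = String(value); }
  };
}
```

Called once with a fetcher that succeeds and then again with one that fails, the second call still reaches the success callback with the stored text. Only a URL that has never been fetched successfully ends up in the failure callback.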

This approach works for the BBC app because there are a relatively small number of URLs in play: all the articles for a given section are in one single structure at a stable URL. If there had been a URL for each article, I might have been worried about the cache filling up after a few weeks… but this seems OK for now. This week’s 10 articles will simply overwrite all of last week’s, since they are keyed off the same URL.

I have an idea that this function will be very useful both here and in future WhitherApps projects.

Right then. Let’s take the feeds (either just fetched, or rescued from local storage), fetch the articles and create the UI. In part 2, we set up some dummy category panels. Let’s remove them from the static HTML and simply leave their containers:

<nav id='bbc_category_selector'></nav>
<div id='bbc_article_selectors'></div>

So we need a function that creates all the pieces needed for a given feed category selector. There are two titles (one which appears in the portrait mode, and one in landscape), and the selector panel itself. This function creates all three – if they are not present – and returns the main selector panel.

function bbcEnsureSelector(title) {
  var id = title.replace(/[^a-zA-Z0-9_]/g, '_');
  if (document.getElementById(id+'-t1') == null) {
    var selectorT1 = bbcCreateNode('a', id+'-t1', title);
    document.getElementById('bbc_category_selector').appendChild(selectorT1);
  }
  if (document.getElementById(id+'-t2') == null) {
    var selectorT2 = bbcCreateNode('h2', id+'-t2', title);
    document.getElementById('bbc_article_selectors').appendChild(selectorT2);
  }
  var selector = document.getElementById(id);
  if (selector == null) {
    selector = bbcCreateNode('nav', id);
    document.getElementById('bbc_article_selectors').appendChild(selector);
  }
  return selector;
}

function bbcCreateNode(tag, id, innerText) {
  var node = document.createElement(tag);
  if (id!=null) {
    node.id = id;
  }
  if (innerText!=null) {
    node.appendChild(document.createTextNode(innerText));
  }
  return node;
}

This may look a little clumsy, but it does include making a safe unique ID (from the title) and using DOM methods to ensure the resulting markup structure is well formed. (I’m thinking that a few of these functions might get moved into a separate library at some point.)
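For the ID part, this is roughly what the replace call does to a title. The helper name bbcSafeId and the sample titles are invented for illustration; the regular expression is the one used in bbcEnsureSelector.

```javascript
// Sketch only: bbcSafeId is a hypothetical name for the replace() call
// that bbcEnsureSelector performs inline. The titles are made up.
function bbcSafeId(title) {
  // Anything that is not a letter, digit or underscore becomes '_'.
  return title.replace(/[^a-zA-Z0-9_]/g, '_');
}

console.log(bbcSafeId("Latin America"));          // Latin_America
console.log(bbcSafeId("Science & Environment"));  // Science___Environment
```

One caveat: the IDs are only unique if the titles still differ after substitution. Two titles such as "UK/US" and "UK US" would both map to "UK_US", so this relies on the feed titles being reasonably distinct.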

Then, just after we parse the bootstrap in the refresh function, let’s call this for each feed:

var feeds = bootstrap.feeds[0].feeds;
for (var f in feeds) {
  var feed = feeds[f];
  var selector = bbcEnsureSelector(feed.title);
}

Seems to work (“At last! A screenshot!”):

Category titles appearing in the left sidebar

Now, each of these category selector panels needs to have the article thumbnails placed in it. This should happen when the bootstrap is loaded (for those that have default=true in the bootstrap, it seems), and also when a selector is shown that hasn’t previously been populated. So we’ll do it in a separate function that can be called in either case.

function bbcPopulateSelector(selector, url) {
  bbcCachedXhr(url, function(responseText){
    var feed = new DOMParser().parseFromString(responseText, 'text/xml').documentElement;
    if (feed.nodeName=='feed') {
      var feedId = feed.getElementsByTagName('id')[0].textContent;
      var entries = feed.getElementsByTagName('entry');
      for (var e=0; e < entries.length; e++) {
        var entry = entries[e];
        var entryId = entry.getElementsByTagName('id')[0].textContent;

        var id = feedId + '-' + entryId;
        var thumbnail = bbcCreateNode('a', id+'-thumbnail');

        var thumbnailImage = bbcCreateNode('img', null);
        thumbnailImage.src = bbcRewriteMediaUrl(
          entry.getElementsByTagNameNS('*', 'thumbnail')[0].getAttribute('url'),
          entryId
        );
        thumbnail.appendChild(thumbnailImage);

        var thumbnailTitle = bbcCreateNode('h3', null,
          entry.getElementsByTagName('title')[0].textContent
        );
        thumbnail.appendChild(thumbnailTitle);

        var content = entry.getElementsByTagName('content')[0];
        var contentNodes = content.childNodes;
        var article = bbcCreateNode('div', id+'-article');
        for (var i=0; i < contentNodes.length; i++) {
          article.appendChild(document.importNode(contentNodes[i], true));
        }

        thumbnail.article = article;

        selector.appendChild(thumbnail);
      }
    }
  }, function(xhr){});
}

function bbcRewriteMediaUrl(url, base) {
  if (url.substr(0, 11)!='bbcimage://') {
    return url;
  }
  url = url.substr(11);
  url = decodeURIComponent(url);
  if (url.substr(0, base.length)!=base) {
    return url;
  }
  url = url.substr(base.length);
  url = url.replace('{device}', 'ipad');
  return 'http://' + url;
}

So what’s going on here? Firstly we call the feed URL (again, cached via localStorage if offline), and then parse the results. For each article entry in the feed, we get its ID, and create a thumbnail link that will sit in the article-selecting category tray. We use the thumbnail image to actually get a picture for it (using the bbcRewriteMediaUrl function to transform its rather strange syntax, as observed in part I), and add the title of the article beneath it. We have to use getElementsByTagNameNS to get the thumbnail image URL, due to the media RSS namespace on that element.
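To see the URL rewriting step in isolation, here is the same function run on a made-up value. The entry ID and image host below are invented purely for illustration, and the real feed values may well be shaped differently. All the sketch shows is the order of operations: strip the scheme, decode, strip the entry ID prefix, fill in the device placeholder.

```javascript
// The function is copied from the listing above; only the sample input is new.
function bbcRewriteMediaUrl(url, base) {
  if (url.substr(0, 11) != 'bbcimage://') {
    return url;                       // ordinary URL: leave it alone
  }
  url = url.substr(11);               // drop 'bbcimage://'
  url = decodeURIComponent(url);
  if (url.substr(0, base.length) != base) {
    return url;                       // not prefixed by the entry ID
  }
  url = url.substr(base.length);      // drop the entry ID prefix
  url = url.replace('{device}', 'ipad');
  return 'http://' + url;
}

// Hypothetical input: an invented entry ID and an invented image host.
var entryId = "entry-123/";
var raw = "bbcimage://" +
  encodeURIComponent(entryId + "img.example.org/{device}/photo.jpg");

console.log(bbcRewriteMediaUrl(raw, entryId));
// http://img.example.org/ipad/photo.jpg
```

A plain http:// URL passed to the function comes back unchanged, which is why it is safe to run over every thumbnail.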

Then, embedded in the Atom is a slab of HTML under the <content> element for the article itself. We get its children out and lash them into a container held on the thumbnail’s DOM node as an ‘article’ property. I’m not entirely sure if this is the most kosher way to do things, but it will do for now. Basically we want it in a place where we can switch it into the main reading panel whenever someone clicks on – or rather, touches – the thumbnail.

After the selector panel is known to be in place in the refresh sequence, let’s call this with the URL of every default feed. We have to read ‘default’ with bracket notation (as a string key) rather than dot notation, because it’s a reserved word in Javascript and Safari complains otherwise:

if (feed['default']) {
  bbcPopulateSelector(selector, feed.feed_url);
}
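The non-default feeds still need populating the first time their tray is shown. One way that might look is sketched below. The function bbcShowSelector and the populated flag are assumptions of mine rather than anything from the app so far, and the populate argument stands in for bbcPopulateSelector.

```javascript
// Sketch only: bbcShowSelector and the 'populated' flag are hypothetical.
// 'populate' stands in for bbcPopulateSelector(selector, url).
function bbcShowSelector(selector, feed, populate) {
  if (!selector.populated) {
    // Set the flag before fetching so repeated taps don't trigger
    // several requests for the same feed.
    selector.populated = true;
    populate(selector, feed.feed_url);
  }
}

// Usage with stand-in objects instead of real DOM nodes:
var calls = 0;
var tray = {};
var europe = { title: "Europe", feed_url: "http://example.org/europe" };
bbcShowSelector(tray, europe, function () { calls++; });
bbcShowSelector(tray, europe, function () { calls++; });
console.log(calls);  // 1
```

A sturdier version would probably clear the flag again if the fetch fails with nothing in the cache, so that a later tap can retry.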

OK, so let’s run that up and see what happens. Hopefully, the bootstrap and then the default feeds will be fetched. We should then see a panel appear for each of the categories, and the images and titles of the articles within them. And…

The thumbnails ...not looking quite right

Well, OK. It did what I claimed, more or less, but we obviously have a few cosmetic problems here. Our category trays are of a constrained height, and the thumbnails are not at all styled to fit them. But it shouldn’t take a lot of CSS to sort this out. Firstly, let’s sort out the tray overflow, and then the display layout of the thumbnails themselves. (We’ll deal with making scrolling look nice in due course – the iPad doesn’t show scroll bars for overflow:scroll, although you can use two fingers to move the trays left and right).

So let’s make the following small additions to the CSS:

#bbc_article_selectors nav {
  ...
  overflow-x:scroll; overflow-y:hidden;
  white-space:nowrap;
}
#bbc_article_selectors nav a {
  display:inline-block;
  width:115px; height:137px;
  margin:5px;
  white-space:normal;
  overflow:hidden;
  background:#000000;
  border:1px solid #ffffff;
}

What a difference that makes. Arnie looks a bit odd (which is surely setting somebody up for a punchline) for an as-yet-unknown reason… but most of the thumbnails now look respectable. We’ll sort out the typography soon enough, but even with the default <h3> styling we can still read most of the headlines.

Thumbnails looking a little better

Of course we should make these <a> thumbnails actually do something. After all, we have the text of the article hanging off an expando DOM property, so let’s put it to good use. We can bind an article display function onto its click event at the end of the bbcPopulateSelector function. It’s tempting to use an anonymous function here, but I think I’ll need this in other places eventually (since in the app you can swipe articles to change them too).

...
  thumbnail.onclick = bbcDisplayArticle;
...

function bbcDisplayArticle(event, thumbnail) {
  if (thumbnail==null) {
    thumbnail = this;
  }
  var articlePanel = document.getElementById('bbc_article');
  articlePanel.innerHTML = '';
  articlePanel.appendChild(thumbnail.article);
  return false;
}

Crude for now, but this latter function will be a good place to put in some swanky page sliding animation at some point.
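The optional second argument is what lets the same function serve as a click handler and as something other code (a swipe handler, say) can call directly. Here is a DOM-free sketch of just that dispatch, with pickThumbnail as an invented name and a plain object standing in for the thumbnail element.

```javascript
// Sketch only: pickThumbnail is a hypothetical, DOM-free cut-down of
// bbcDisplayArticle that returns the thumbnail it would display.
function pickThumbnail(event, thumbnail) {
  if (thumbnail == null) {
    thumbnail = this;  // invoked as an onclick handler: 'this' is the element
  }
  return thumbnail;
}

var thumb = { article: "article content goes here" };
thumb.onclick = pickThumbnail;

// As an event handler, 'this' supplies the thumbnail:
console.log(thumb.onclick({}) === thumb);            // true
// Called directly, the thumbnail is passed explicitly:
console.log(pickThumbnail(null, thumb) === thumb);   // true
```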

So at least we can now pull up articles:

Houston, we have an article

Lovely. Seems to be one final thing to do though: rewrite the images that appear in articles. It makes sense to do this once – probably at the time we are stealing the article out of the Atom DOM and putting it into the HTML one. Let’s just do a quick scan for <img> tags and rewrite their src attributes, just before the importNode function puts them into the DOM (this means that Safari won’t try to go and get them before we can get them into the right syntax).

var contentImages = content.getElementsByTagName('img');
for (var i=0; i < contentImages.length; i++) {
  contentImages[i].src = bbcRewriteMediaUrl(contentImages[i].src, entryId);
}

It turns out that some of the images are placeholders for BBC video files too: these are <a> tags with bbcvideo:// schemes and bandwidth metadata. Let’s look for those too, and enhance our media URL rewriter:

var contentVideos = content.getElementsByTagName('a');
for (var i=0; i < contentVideos.length; i++) {
  contentVideos[i].href = bbcRewriteMediaUrl(contentVideos[i].href, entryId);
}

if (url.substr(0, 11)!='bbcimage://' &&
    url.substr(0, 11)!='bbcvideo://') {
...
   url = url.replace('{bandwidth}', 'wifi');
...
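Pieced together, the enhanced rewriter might look something like the sketch below. This is my reading of how the fragments slot into the earlier function. It gets a different name (bbcRewriteMediaUrlWithVideo) so it cannot be mistaken for the app's actual code, and the sample input is invented.

```javascript
// Sketch only: an assembled version of the fragments above, under a
// hypothetical name. The sample input below is made up.
function bbcRewriteMediaUrlWithVideo(url, base) {
  if (url.substr(0, 11) != 'bbcimage://' &&
      url.substr(0, 11) != 'bbcvideo://') {
    return url;                       // neither custom scheme: leave as-is
  }
  url = decodeURIComponent(url.substr(11));
  if (url.substr(0, base.length) != base) {
    return url;
  }
  url = url.substr(base.length);
  url = url.replace('{device}', 'ipad');
  url = url.replace('{bandwidth}', 'wifi');
  return 'http://' + url;
}

var entryId = "entry-123/";
var raw = "bbcvideo://" +
  encodeURIComponent(entryId + "media.example.org/{device}/{bandwidth}/clip");

console.log(bbcRewriteMediaUrlWithVideo(raw, entryId));
// http://media.example.org/ipad/wifi/clip
```

Both schemes happen to be eleven characters long, which is why the same substr(0, 11) check works for each.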

There’s more to the video thing than meets the eye. If you try the app you’ll be confronted with a cryptic MP4 URL in a manifest of some sort, and so I’ll need to do a bit more hacking to get the videos showing. Something to do with British TV license payers or some such nonsense. But at least we now have images within the articles:

Articles with pictures (just don't click on the video links!)

And, if you compare it with a screen shot of the real thing, you’ll see we’re actually not doing too badly:

The real thing, earlier today

That’s it for this time. But we have an app you can just about use. Please check it out on your iPads here (but don’t all go too crazy at the same time, since it’s using my server as the proxy!) and see what you think. Yes, I know the portrait version doesn’t work yet, and the scrolling is pretty poor… but hey, come on.

In the next episode, after all this heavy lifting, I think I deserve to play around with some CSS. Lots of lovely -webkit-gradient coming up (I think that’s what you were all waiting for, anyway).

Back soon.

Comments (23)

  1. August 8, 2010

    [...] This post was mentioned on Twitter by Jason, HTML5 Guy. HTML5 Guy said: Rewriting BBC News iPad app in HTML5 http://bit.ly/d1VIfL YES, YES, YES: Rewriting BBC News iPad app in HTML5 http… http://bit.ly/c9Z2qz [...]

  2. August 8, 2010
    Jeff Sonstein said...

    you might want to take a look at how JQuery handles cross-site JSONP requests, it would not be too hard to use their approach in your own code w/o loading JQuery itself

    - jeffs -

    • August 9, 2010
      James said...

      Right. But surely that still needs the server to play the game (with pre-flight checks etc etc)… obviously something I don’t have control over.

  3. August 8, 2010

    You should be careful with that nginx proxying configuration. Anyone can craft an HTTP request with an X_BBC_URL that isn’t the BBC site, and use your server as an open proxy.

    • August 9, 2010
      James said...

      Yeah – this project just got a whole load more attention than I was expecting, so it’s getting locked down.

  4. August 8, 2010
    arnaud said...

    the js templating library might come in handy, relatively small.

    http://github.com/janl/mustache.js/

  5. August 19, 2010
    Phunky said...

    You should use YQL for this, with a couple of clever JOINs you could scrape all the content in one request to Yahoo! Seriously good work tho, this has actually inspired me to give it a go myself :)

    If you need any help give us a shout :D

    • August 19, 2010
      James said...

      Thanks!

      I hadn’t thought too much about things like YQL… probably because I discovered early on that the BBC only played the game when sent the right user-agent. (It probably is possible to get YQL to send a particular header, but I didn’t think to check… figured I’d try to make everything self-contained)

      I have a growing list of apps that could do with some debunking if you want to be a guest author! :-)

  6. August 21, 2010

    [...] has already produced three blog posts rewriting the BBC iPhone app but with HTML5 (Part I, Part II, Part III). I encourage you to read them. He’s already gotten impressively far; here is a screenshot of [...]

  7. August 21, 2010

    [...] This post was mentioned on Twitter by aslund, aslund. aslund said: BBC-News-iPad-App-Nachbau: http://whitherapps.com/bbc-news http://whitherapps.com/bbc-news-pt-ii http://whitherapps.com/bbc-news-pt-iii [...]

  8. August 22, 2010

    [...] a web app based on HTML5. He has already started 3 blog post on the BBC News apps, Part1, Part2 and Part3. If you see the screenshot below he has gotten pretty far. BBC News (HTML5) BBC News [...]

  9. August 23, 2010
    Dave said...

    Fascinating series, really enjoying reading it. If you are serious about exploring this idea with any other apps and are looking for help, I’d love to be involved!

  10. August 26, 2010
    John Holdun said...

    Oh but this is some inspired work so far. I’ll try not to make a habit of commenting on every post, but add me to the list of willing participants!

  11. September 15, 2010
    Ryan Ore said...

    I am sincerely looking forward to reading more. It’s very inspiring for people trying to focus on web technologies for app development. Thanks.
