A JavaScript function to extract text from an anchor tag (using a regex)

There are probably better ways to do this, but here’s a JavaScript function I wrote for a Sencha ExtJS application that extracts the text from an anchor tag (hyperlink):

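// expects something like '<a href="#" class="taskName">foo bar baz</a>'
// returns the link text, or null if the input doesn't look like an anchor tag
getTextFromHyperlink: function(linkText) {
    // match returns null when the pattern isn't found, so check before indexing
    var result = linkText.match(/<a [^>]+>([^<]+)<\/a>/);
    return result ? result[1] : null;
}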

The regular expression in the match method does the dirty work. I find it hard to explain regular expressions, but the important part is that I extract the text portion of the anchor tag in this code:

([^<]+)

The parentheses indicate that I want to capture/extract this portion of the regex, while ignoring the rest of it. The JavaScript string match method returns an array (or null when nothing matches), and because I want the part that I’ve captured, I return the element at index 1. (With the match method, the element at index 0 is the entire matched text, not just the captured part, which can be helpful but also confusing.)
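
To make the array that match returns a little more concrete, here’s a small standalone sketch. The variable names are mine and the sample string is the one from the comment above the function, wrapped in a made-up <td> so that the difference between the input and the match is visible. Element 0 holds only the text the pattern matched, which equals the whole input only when the pattern spans the entire string:

```javascript
// Sample input from the function's comment, wrapped in a hypothetical <td>
var link = '<a href="#" class="taskName">foo bar baz</a>';
var result = ('<td>' + link + '</td>').match(/<a [^>]+>([^<]+)<\/a>/);

var wholeMatch = result[0]; // the matched anchor tag, without the surrounding <td>
var linkText = result[1];   // the captured group: 'foo bar baz'

// When the pattern isn't found, match returns null rather than an empty array
var noMatch = 'plain text, no link'.match(/<a [^>]+>([^<]+)<\/a>/);
```

Because of that null case, indexing the result directly with [1] throws a TypeError whenever the input isn’t shaped like an anchor tag.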

A little regex explanation

Okay, here’s a short explanation of that regex:

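  • <a [^>]+> matches the opening tag: the literal "<a " (including the space), then one or more characters that aren’t ">", then the closing ">"
  • ([^<]+) captures what I want to extract: one or more characters that aren’t "<"
  • <\/a> matches the closing anchor tag (the backslash escapes the slash inside the regex literal)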
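
Those three pieces also show where the pattern gives up. The sketch below is only an illustration: linkTextOrNull is a hypothetical helper I’m using to avoid indexing into null, and the sample strings are made up.

```javascript
// Same pattern as in getTextFromHyperlink
var anchorPattern = /<a [^>]+>([^<]+)<\/a>/;

// Hypothetical helper: returns the captured text, or null when nothing matches
function linkTextOrNull(html) {
    var m = html.match(anchorPattern);
    return m ? m[1] : null;
}

var withAttributes = linkTextOrNull('<a href="#" class="taskName">foo bar baz</a>'); // 'foo bar baz'
var noAttributes = linkTextOrNull('<a>foo</a>');              // null: "<a " needs a space plus at least one more character
var nestedTag = linkTextOrNull('<a href="#"><b>foo</b></a>'); // null: the captured text can't contain "<"
var emptyText = linkTextOrNull('<a href="#"></a>');           // null: + requires at least one character
```

So the regex is a reasonable fit for simple, predictable links like the one in my application, but not for arbitrary HTML.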

Again, there are probably better ways to do this, including DOM-related approaches I don’t know yet, but if you need a JavaScript function to extract the text from an HTML anchor tag, I hope this is a helpful start.