Analyzing NZ Herald’s Sources

For those outside of NZ, this post is about NZ’s largest national newspaper, “The New Zealand Herald”. If you don’t live in NZ you might not find it that interesting, but it’s still a good look into how journalism within NZ is slowly being shut down and replaced with clickbait type stories and syndicated content.

Over the past month there have been a couple of articles floating around the web, most notably this one by Russel Brown, and another piece by David Farrar. They talk about how the NZ Herald’s online edition seems to be filling up more and more with “Daily Mail” type news: usually stories with a headline like “…. And what happened next will amaze you!” or “See what made *B Grade Celebrity here* cry”. On top of that, I’ve begun to notice that many stories listed online at the Herald are simply scraped/copy-pasted articles from the Associated Press or another online newspaper, essentially turning our national newspaper into syndicated garbage.

At the bottom of every article you can usually find the “source” of the article. It looks a bit like this :


It got me thinking. Because every article has a tag on where it came from, it should be easy to do a quick scrape of the website and tell us just how much of the Herald’s content is actually theirs, and how much is syndicated. I did think about doing a massive crawl all over the website, but it seemed easier just to pick all the front page stories and check them. So I quickly whipped up a tool to do just that. And the results will shock you! (har har)
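For the curious, here is a minimal sketch in Python of the kind of check involved. It is not the actual tool, and it rests on a big assumption: that the source marker sits in an element with a class named `article-source`. That class name is made up for illustration, so you would need to inspect the real page markup and swap in whatever it actually uses. Fetching the front page and following the article links is left out; the functions just take the HTML of each article as a string.

```python
from collections import Counter
from html.parser import HTMLParser


class SourceTagParser(HTMLParser):
    """Collects the text of every element carrying the assumed source class."""

    def __init__(self, source_class="article-source"):  # hypothetical class name
        super().__init__()
        self.source_class = source_class
        self._tag = None      # tag name of the source element we are inside, if any
        self._depth = 0       # nesting depth of same-named tags inside it
        self._chunks = []     # text pieces seen inside the source element
        self.sources = []     # finished source labels, in page order

    def handle_starttag(self, tag, attrs):
        if self._tag is None:
            classes = (dict(attrs).get("class") or "").split()
            if self.source_class in classes:
                self._tag, self._depth = tag, 1
        elif tag == self._tag:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                self.sources.append("".join(self._chunks).strip())
                self._tag, self._chunks = None, []

    def handle_data(self, data):
        if self._tag is not None:
            self._chunks.append(data)


def article_source(html):
    """Return the source label for one article page, or "(blank)" if none is found."""
    parser = SourceTagParser()
    parser.feed(html)
    parser.close()
    if parser.sources and parser.sources[-1]:
        return parser.sources[-1]
    return "(blank)"


def tally_sources(pages):
    """Count how often each source label appears across a list of article pages."""
    return Counter(article_source(html) for html in pages)
```

Feeding `tally_sources` the HTML of every front page story gives a source-to-count mapping that can be dumped straight into a spreadsheet.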

After running my app I ended up with a set of totals that looked like this. Note that (blank) means there was no syndication marker. I believe these are from the Herald (possibly online-only stories).

Source Count
NZ Herald 62
Associated Press 36
Daily Mail 8 6
Bay Of Plenty Times 4
Northern Advocate 4
(blank) 4
Canvas 2
Daily Telegraph UK 2
Hawkes Bay Today 2
Washington Post 2
Herald On Sunday 1
Christchurch Star 1
The Country 1
Wanganui Chronicle 1

When we actually group “types” of sources together, we end up with something like this.

Source Count
Herald Sources 69
Local Sources 10
Other Sources 54
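The grouping step itself is simple enough to sketch in Python. Treat the mapping below as an assumption for illustration rather than the exact assignment behind the table: it puts the Herald's own labels in one bucket, a few regional papers in another, and sends anything it doesn't recognise to “Other Sources”.

```python
from collections import Counter

# Illustrative assignment of source labels to groups (an assumption, not the
# definitive list). Any label missing from this mapping falls into "Other Sources".
GROUPS = {
    "NZ Herald": "Herald Sources",
    "(blank)": "Herald Sources",
    "Canvas": "Herald Sources",
    "Herald On Sunday": "Herald Sources",
    "Bay Of Plenty Times": "Local Sources",
    "Northern Advocate": "Local Sources",
    "Hawkes Bay Today": "Local Sources",
}


def group_counts(source_counts, groups=GROUPS, default="Other Sources"):
    """Roll per-source counts up into per-group counts."""
    grouped = Counter()
    for source, count in source_counts.items():
        grouped[groups.get(source, default)] += count
    return grouped


# A few rows taken from the per-source table, not the whole thing.
sample = {
    "NZ Herald": 62,
    "(blank)": 4,
    "Canvas": 2,
    "Herald On Sunday": 1,
    "Associated Press": 36,
    "Bay Of Plenty Times": 4,
}
print(group_counts(sample)["Herald Sources"])  # 69
```

With this mapping, the four Herald-type rows (62 + 4 + 2 + 1) add up to the 69 shown for Herald Sources.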

So we can see that the Herald itself only makes up about half of its own content; the rest comes from either local sources or from “Other sources” such as the Daily Mail, the AP feed, or other overseas partners.
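To put rough numbers on “half”, here is the arithmetic on the grouped totals as a small Python snippet. The only input is the three counts from the grouped table above.

```python
# Grouped totals copied from the grouped table above.
grouped = {"Herald Sources": 69, "Local Sources": 10, "Other Sources": 54}


def shares(counts):
    """Return each group's percentage of the total, rounded to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * count / total, 1) for name, count in counts.items()}


print(shares(grouped))
# {'Herald Sources': 51.9, 'Local Sources': 7.5, 'Other Sources': 40.6}
```

So out of 133 front page stories, just under 52% carry a Herald-type source and around 41% come from the “Other” bucket.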

What is clear is that the Herald loves using the Associated Press. I could be wrong, but the entire latest news section on the Herald looks like a straight feed from AP with no editorial work done on it whatsoever. If so, all of these stories are unedited, straight syndication.


It’s kinda interesting to me because a while back, “autoblogs” used to be this big thing, where you set up a blog and simply had it publishing 10 different feeds, not even editing the articles along the way. But Google got tired of the same content being in multiple places, so it started to detect the “original”, if you will, and only rank that one. So I’m interested in how Google feels about the fact that all these places are posting the exact same article for clicks/views/whatever.

I took an article and searched for the exact title in Google to see how many places it showed up. As of this post, there are 3500+ exact copies of this article floating around, probably all posted verbatim from the AP feed.


As I guessed, the “original” article on AP is the top result in Google. Because of this, it makes me question what exactly is the point in re-posting the feed as is on the Herald. Although I will never know, I wonder how many people are actually reading these stories there, or whether they are just “bloat” designed to make it look like the Herald is always up to date with the latest news, even if it isn’t theirs.

If you are interested in the actual data, I’ve uploaded the Excel spreadsheet I used to pull the data here. As always, I love graphs/data comparisons in the comments!

UPDATE : Check out part 2 here!


2 thoughts on “Analyzing NZ Herald’s Sources”

    1. Wade says:

      I just need to rewrite my code to go deeper than the homepage. It’s not difficult, I just have to find the time. I’ve had quite a few people ask the same question (I didn’t think the post would gain as much traction as it did). I plan to find some time this week to change my app to pull 100 pages per category and check their source. I want the pages to all be from within the past month so that we are looking at a similar sort of timeframe the entire way through. From that I should be able to get 1000+ articles with their sources, and have much better numbers.
