A Tale of Two Link Aggregators

At 2025-11-15T15:38:20Z, I submitted my Goofing on Meta blog post to the Hacker News link aggregation website. Shortly after that, at 2025-11-15T18:22:00Z, the short-blurb tech news website slashdot.org put an article about the same blog post in its Technology News section.

Since I run my own web server, I can see how the two sites’ users compare.

The Hacker News post did not make it to the front page, and thus did not get much human attention. It got 3 upvotes and 0 (zero) comments. This is the same experience I’ve had with other Hacker News submissions.

The Slashdot post got 43 comments. It doesn’t look like Slashdot articles show upvotes. I suspect that the Slashdot submitter noticed the post on Hacker News, but it’s possibly a complete coincidence.

The comments on Slashdot are nearly incomprehensible, and reflect worldviews rooted in verifiably false assumptions. Slashdot articles also aren’t as ephemeral as Hacker News submissions. They are at least curated by editors, not simply submitted by internet randos.

A trip through my Apache “combined” format log files shows even more differences.

What does a human’s access look like?

If I use a web browser to view one of my own blog’s posts, I see a series of HTTP GET requests in the current Apache httpd log file.

Path part of URL                          Referrer
/posts/kea-dhcp-daemon/                   -
/css/style.css                            httpx://bruceediger.com/posts/kea-dhcp-daemon/
/images/network_diagram2.png              httpx://bruceediger.com/posts/kea-dhcp-daemon/
/images/Dirk.willems.rescue.ncs_30.png    httpx://bruceediger.com/posts/kea-dhcp-daemon/
/avatar.php                               httpx://bruceediger.com/posts/kea-dhcp-daemon/
/js/menu.js                               httpx://bruceediger.com/posts/kea-dhcp-daemon/
/favicon.ico                              httpx://bruceediger.com/posts/kea-dhcp-daemon/

I did that with a nice, fresh, browser process that hadn’t pulled anything from my blog for a while. It had nothing cached, so it retrieved every JavaScript, CSS file and image referenced in the post’s HTML. That’s similar browser behavior to what an interested reader clicking on a link in Hacker News or Slashdot would cause.

Because I typed in the URL of my Kea DHCP daemon blog post, that GET request has no referrer. Requests resulting from clicks on Hacker News links have referrers of https://news.ycombinator.com/. Requests resulting from clicks on Slashdot links have referrers of https://slashdot.org/. Any requests for supporting files in blog posts have referrers of the URL of the blog post.
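To make the fields concrete, here is a minimal Python sketch of pulling the IP address, path and referrer out of one “combined” format line. It is not the Go program described in this post. The field names time, method, status, size and agent are assumptions made for illustration, the sample line is made up, and the pattern does not handle escaped double quotes inside a field.

```python
import re

# Apache "combined" format:
#   host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<ipaddr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

# Made-up sample line; 203.0.113.7 is a documentation-only address.
sample = (
    '203.0.113.7 - - [15/Nov/2025:16:52:34 +0000] '
    '"GET /css/style.css HTTP/1.1" 200 4321 '
    '"https://bruceediger.com/posts/kea-dhcp-daemon/" '
    '"Mozilla/5.0 (X11; Linux x86_64)"'
)
fields = parse_line(sample)
print(fields["ipaddr"], fields["url"], fields["referrer"])
```

A request for a supporting file like this one carries the blog post URL as its referrer, which is what the criteria that follow rely on.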

I decided that an IP address associated with a human would:

  1. GET a URL with referrer of https://news.ycombinator.com/ or https://slashdot.org/.
  2. Subsequently GET /css/style.css with a referrer of https://bruceediger.com/posts/..*

It looks to me like known bots/spiders (Drakma, Google, Bing, etc etc) rarely if ever bother to retrieve CSS or JavaScript files. I’m certain that a large number of bots access my blog with User Agent strings that indicate a human’s web browser. I’ve noticed that most or all of those bots also don’t retrieve CSS or JavaScript files referenced in blog post HTML.

I believe that a clever person could use grep and other Linux shell tools to capture IP addresses retrieving files according to my criteria, but I’d already written a Go-regexp-package based, “combined” log file format parser. I had an idea where I’d graft a logical expression evaluator on that program to allow selecting log file lines that indicated retrieval of /css/style.css AND had a referrer matching https://bruceediger.com/posts/. I’d already written logical and arithmetic expression lexers and parsers for a couple of other projects, so I felt compelled to carry out this idea.

Code for “combined” format log file parser/extractor.

As just one example, this command line finds log file lines that have a blog post path, and a Slashdot referrer, printing only the IP address:

combined -f ipaddr \
     -e 'url~/\/posts\/..*/ &&  referrer~/https:\/\/.*slashdot.org\//' \
    access.log.5

I had to use a regular expression match (indicated by the ~) for referrer field matching. Slashdot sends various “referrer” strings like https://m.slashdot.org/, https://slashdot.org/, https://slashdot.org/?nobeta=1 and https://tech.slashdot.org/.
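A small Python sketch shows why an equality test falls short for this field. It uses the four referrer strings listed above plus the Hacker News referrer as a negative case. It assumes the ~ operator behaves like an unanchored regular expression search, and it escapes the dot in slashdot.org, which the command line above does not.

```python
import re

# Pattern from the command line above, with the dot in "slashdot.org" escaped.
SLASHDOT = re.compile(r'https://.*slashdot\.org/')

referrers = [
    "https://m.slashdot.org/",
    "https://slashdot.org/",
    "https://slashdot.org/?nobeta=1",
    "https://tech.slashdot.org/",
    "https://news.ycombinator.com/",
]

# An exact string comparison only catches one of the Slashdot variants.
exact = [r for r in referrers if r == "https://slashdot.org/"]

# An unanchored regular expression search catches all four.
matched = [r for r in referrers if SLASHDOT.search(r)]

print(len(exact), len(matched))
```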

The procedure for finding “human” blog post accesses goes like this:

  1. Collect IP addresses from Apache log files that retrieved a blog post and have a Slashdot or Hacker News referrer.
  2. From that collection, sieve out the IP addresses that retrieved /css/style.css with a blog post referrer.
  3. Select all the Apache log file lines that have those sieved-out IP addresses.
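The three steps can be sketched in Python as set operations over already-parsed records. This is an illustration under assumptions, not the combined program: records are (ipaddr, url, referrer) tuples, the post path and IP addresses in the sample data are hypothetical, and the sketch ignores whether the stylesheet request came after the post request.

```python
import re

POST = re.compile(r'^/posts/.+')
AGGREGATOR = re.compile(r'^https://(news\.ycombinator\.com|.*slashdot\.org)/')
OWN_POST = re.compile(r'^https://bruceediger\.com/posts/.+')

def human_lines(records):
    """records: list of (ipaddr, url, referrer) tuples in log file order."""
    # Step 1: IPs that fetched a blog post with an aggregator referrer.
    referred = {ip for ip, url, ref in records
                if POST.match(url) and AGGREGATOR.match(ref)}
    # Step 2: of those, keep IPs that also fetched the stylesheet
    # with a blog post as referrer.
    humans = {ip for ip, url, ref in records
              if ip in referred
              and url == "/css/style.css"
              and OWN_POST.match(ref)}
    # Step 3: every record from the surviving IPs.
    return [rec for rec in records if rec[0] in humans]

# Hypothetical records; the addresses come from a documentation-only range.
post_url = "https://bruceediger.com/posts/example-post/"
records = [
    ("198.51.100.1", "/posts/example-post/", "https://slashdot.org/"),
    ("198.51.100.1", "/css/style.css", post_url),
    # Referred by Hacker News but never fetches the stylesheet.
    ("198.51.100.2", "/posts/example-post/", "https://news.ycombinator.com/"),
    # Fetches the stylesheet but has no aggregator referral.
    ("198.51.100.3", "/css/style.css", post_url),
]
selected = human_lines(records)
print(selected)
```

Only the first IP address survives both filters, so only its two records are selected.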

There’s an additional nuance to the log file lines selected in step (3). If you select them as they appear in the log file, you see interwoven sets of accesses (as described above) from different IP addresses. I found it worthwhile to sort the selected log file lines on IP addresses. The GNU sort command will break ties based on other (whitespace-separated) fields in the sorted lines by default. I had to use sort -k1,1 -k4,4 --stable to keep sort from sorting on the “path” part of the URL, which appears in the log files. Timestamp resolution is only 1 second, so sorting on the timestamp without the --stable flag gets you log files sorted at least partially on lexical comparisons of path.
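The same ordering can be sketched in Python, whose sorted() is always stable, so lines with equal IP address and timestamp keep their log file order. This is an illustrative sketch with made-up log lines. Unlike sort -k4,4, it parses the timestamp rather than comparing it as text, and it assumes English month abbreviations.

```python
from datetime import datetime

def sort_key(line):
    """Key on IP address, then on the parsed timestamp."""
    fields = line.split()
    ip = fields[0]
    # Fields 4 and 5 together look like "[15/Nov/2025:16:52:34 +0000]".
    stamp = datetime.strptime(fields[3] + " " + fields[4],
                              "[%d/%b/%Y:%H:%M:%S %z]")
    return (ip, stamp)

# Made-up lines. The two from 198.51.100.1 share a timestamp.
lines = [
    '198.51.100.2 - - [15/Nov/2025:16:52:35 +0000] "GET /posts/a/ HTTP/1.1" 200 1 "-" "-"',
    '198.51.100.1 - - [15/Nov/2025:16:52:34 +0000] "GET /posts/b/ HTTP/1.1" 200 1 "-" "-"',
    '198.51.100.1 - - [15/Nov/2025:16:52:34 +0000] "GET /css/style.css HTTP/1.1" 200 1 "-" "-"',
]

ordered = sorted(lines, key=sort_key)
paths = [line.split()[6] for line in ordered]
print(paths)
```

The post request stays ahead of the stylesheet request for the same IP address and second. A tie-break on the text of the line would put /css/style.css first.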

Hacker News

Between 2025-11-15T16:52:34Z and 2025-11-15T17:22:05Z, 24 humans requested the linked blog post from a Hacker News referral. No other Hacker News referrals have appeared in my log files since. Of those 24, 1 (one) person requested another URL, my infinite website post, which is linked in the first few paragraphs of the “Goofing” post. Hacker News readers that can be bothered to click the link on the HN page are a low-curiosity group.

OS             Count
MacOS Intel    8
Android        6
Windows        4
iOS            5
Linux          3

That sums to 26, not 24. One IP address requested the linked blog post twice, once as Windows, once as Android. Another IP address requested the URL twice, both times as MacOS.
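A rough Python sketch shows how a per-request tally can exceed the number of IP addresses. The substring rules in os_of are assumptions made for illustration, not the classification actually used for the table, and the sample requests are made up.

```python
from collections import Counter

def os_of(agent):
    """Very rough OS guess from a User-Agent string (assumed rules)."""
    # Order matters: iPhone and iPad agents contain "Mac OS X",
    # and Android agents contain "Linux".
    rules = [
        ("iPad", "iPad"),
        ("iPhone", "iOS"),
        ("Android", "Android"),
        ("Windows", "Windows"),
        ("Macintosh", "MacOS Intel"),
        ("Linux", "Linux"),
    ]
    for needle, name in rules:
        if needle in agent:
            return name
    return "Other"

# Made-up (ipaddr, user agent) pairs, one per request for the linked post.
requests = [
    ("198.51.100.1", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"),
    ("198.51.100.1", "Mozilla/5.0 (Linux; Android 14; Pixel 8)"),
    ("198.51.100.2", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"),
    ("198.51.100.2", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"),
]

per_request = Counter(os_of(agent) for _, agent in requests)
distinct_ips = {ip for ip, _ in requests}
print(sum(per_request.values()), len(distinct_ips))
```

Here two IP addresses produce four tallied requests, the same effect that turns 24 humans into a table that sums to 26.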

This is not in alignment with a recent Hacker News poll about what OS respondents use to develop software. November 15, 2025 was a Saturday. Perhaps poll respondents use other things on days off.

Slashdot

Between 2025-11-15T19:57:18Z and 2025-11-25T23:36:20Z the linked blog post got requested by 2521 IP addresses referred from Slashdot. I count 2500 of those IP addresses as being directed by humans. That’s different from Hacker News referrals. There’s also a very long tail of referrals from Slashdot. My blog was still getting referrals 10 days after the initial Slashdot article appeared.

Slashdot user agent strings are far more varied than Hacker News user agent strings. The following is very rough.

OS             Count
MacOS Intel    888
Linux          859
Windows        780
Android        501
iOS            365
iPad           28

That sums to 3421 operating system sightings, well above the 2500 IP addresses I counted as human. Looks like a lot more Slashdot IPs have NAT, or maybe I did a worse job of filtering out bots on Slashdot referrals. Macs and Linux machines predominate, and there are substantially more Android users than iPhone users.

The IP addresses of the human Slashdot referrals went on to request 283 of the 379 blog posts I’d made at the time.

The top 25 most requested posts:

Post                                      No. of requests  Currently on Page
Goofing on Meta                           1704             1
My Infinite Website                       1468             26
Pacman Fix up                             38               1
GPS PPS Separation                        33               1
Weird Sign                                32               7
Dyck Languages                            31               2
What To Do When Nothing Has Happened      30               1
Two podcasts reviewing Starship Troopers  30               1
Page 2                                    24               2
Page 3                                    21               3
Hazmat Placards I have seen               18               2
Depth-first Unary Degree Sequence         17               1
Weird 5G Rant                             16               2
90 Degree Pick Tool                       16               3
Benford’s Law, 2013 and 2025              16               1
The Mammoth Site                          15               26
Page 4                                    15               4
Compound Steam Locomotive                 14               26
Page 5                                    14               5
More Starship Troopers Fanfic Ideas       13               2
Page 6                                    11               6
Printer Advice                            11               26
Cybercrime is (often) boring              10               3
Folders Are Bad                           10               25
Information Camouflage                    10               29
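A tally like the one above, along with a count of distinct paths requested, can be sketched with Python's collections.Counter. The paths below are hypothetical stand-ins for the url field of the human-only log lines, and tally is a name invented for this example.

```python
from collections import Counter

def tally(paths, n=25):
    """Return the n most requested paths and the number of distinct paths."""
    counts = Counter(paths)
    return counts.most_common(n), len(counts)

# Hypothetical paths standing in for the url field of selected log lines.
paths = [
    "/posts/a/", "/posts/b/", "/posts/a/",
    "/page/2/", "/posts/a/", "/posts/b/",
]
top, distinct = tally(paths, n=2)
print(top, distinct)
```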

The “Goofing on Meta” and “My Infinite Website” posts are both links in the Slashdot article, explaining their high request count. I don’t understand why 2500 humans requested blog posts, but only 1704 requested the post that Slashdot linked to. Some people must not have clicked the title, but instead clicked the Slashdot link to “My Infinite Website”, which had to have confused them.

Some of the humans requested up to 21 of the “pages” of posts. My blog is set to do 10 posts per page.

48 “tags” URLs got requested. The top 5: Starship Troopers, Cybercrime, Tools, Internet, Computer Science. That checks out. To my mind, I only have 43 tags at the moment. A few tags have more than 10 posts associated, so there are “pages” inside of some tags. Some of the Top 25 most requested posts did not appear on “pages” indexes. Slashdot humans found them from various “tags” pages.

Conclusions

  1. Hacker News readers are very different than Slashdot readers.
  2. Slashdot readers are more numerous than Hacker News readers, are mostly not Windows users, and are more curious.
  3. Unless a web page reaches the Hacker News “front page”, a Hacker News link does not get a page in front of as many people as a Slashdot post.
  4. The Slashdot Effect doesn’t seem to exist right now, but a few of the people referred by Slashdot will look over your entire website.

Value of my “combined” format log file parser/extractor

My “combined” format log file parser/extractor was of great value when finding information about what Slashdot human-associated IP addresses had requested.

I wanted to see what URLs (paths, really) got requested from “tags” pages on my blog. I had a “combined” log file format file already prepared based on the “which IPs are associated with humans” procedure above. A one line command sufficed to see that:

combined -e 'url~/\/posts\/..*/ && referrer~/bruceediger.com\/tags\//' \
    -f url,referrer  slashdot.sorted.human.log
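For comparison, here is a Python sketch of the same question taken one step further, grouping requested posts by the tag page that referred them. It assumes tag pages live at /tags/<name>/, which the referrer pattern above suggests but does not confirm. The records are hypothetical (url, referrer) pairs of the kind -f url,referrer prints.

```python
import re
from collections import defaultdict

# Assumed tag page layout: https://bruceediger.com/tags/<name>/...
TAG_REF = re.compile(r'bruceediger\.com/tags/([^/]+)')
POST = re.compile(r'^/posts/.+')

def posts_by_tag(records):
    """records: iterable of (url, referrer) pairs. Returns {tag: set of post paths}."""
    found = defaultdict(set)
    for url, ref in records:
        m = TAG_REF.search(ref)
        if m and POST.match(url):
            found[m.group(1)].add(url)
    return found

# Hypothetical records for illustration only.
records = [
    ("/posts/some-post/", "https://bruceediger.com/tags/tools/"),
    ("/posts/other-post/", "https://bruceediger.com/tags/tools/page/2/"),
    ("/css/style.css", "https://bruceediger.com/tags/tools/"),
    ("/posts/some-post/", "https://slashdot.org/"),
]
by_tag = posts_by_tag(records)
print(dict(by_tag))
```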

When I was extracting IP addresses that I deemed were controlled by humans, I was discouraged. I could have done those tasks with shell scripts and command line tools. I don’t think I could have easily answered some of the questions that came to me while manually inspecting the human-associated IP address log file lines.

Around the Web

The “Goofing on Meta” post made the rounds.

Naked Capitalism

Lobste.rs

LinkedIn, for pity’s sake

Polish tech news site Elektroda

Some random podcast

Looks to me like Lobste.rs picked it up from Hacker News, all the others from Slashdot.