Earth Notes: On Website Technicals (2020-09)

Updated 2020-09-22 21:20 GMT.
Tech updates: Brotli side dish, AMP https preferred, H2 oddity, anchor ads away, forever compression, https://www, 92222[2]...

2020-09-22: Cache-Control Simplification

I have simplified the config for the m. and amp. files. Each site now has a uniform Cache-Control for all its files, set at ~1 day for amp. (to avoid larding on top of the AMP cache too much) and ~11 days for m. to minimise traffic even at the cost of being a bit more stale. Having the same value for all objects per site should maximise H2 (and H3) header compression.

To reduce header size the Expires header is not sent. (Accept-Ranges and ETag are also omitted.)

The actual Cache-Control max-ages used are 92222 and 922222 seconds for amp. and m. respectively for an efficient representation for H1, H2, and H3. (For H2 and H3 the static Huffman code sizes for the digits themselves are relevant.)
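As a rough illustration of why those digit choices are cheap, here is a minimal Python sketch that totals the static Huffman code lengths for the digits of a max-age value. The bit lengths are taken from the HPACK Huffman table (RFC 7541 Appendix B, which QPACK reuses); the function names are made up for this example, and only the digits are counted, not the rest of the header value.

```python
# Bit lengths of the decimal digits in the static Huffman code used by
# HPACK (H2) and QPACK (H3), per RFC 7541 Appendix B:
# '0', '1' and '2' have 5-bit codes; '3' to '9' have 6-bit codes.
DIGIT_BITS = {d: 5 for d in "012"}
DIGIT_BITS.update({d: 6 for d in "3456789"})


def huffman_bits(digits: str) -> int:
    """Total Huffman-coded bits for a string of decimal digits."""
    return sum(DIGIT_BITS[d] for d in digits)


def huffman_octets(digits: str) -> int:
    """Whole octets needed if the digits alone were padded to a boundary."""
    return (huffman_bits(digits) + 7) // 8


if __name__ == "__main__":
    # 86400 and 950400 are exactly 1 and 11 days, shown for comparison.
    for max_age in ("92222", "922222", "86400", "950400"):
        print(max_age, huffman_bits(max_age), huffman_octets(max_age))
```

Under these lengths 922222 costs 31 bits, against 33 bits for 950400 (exactly 11 days), so the run of 2s is slightly cheaper as well as being one character shorter than a seven-digit value for H1.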

The 'public' value is unlikely to help overall, so is omitted.

2020-09-21: Canonical WWW Now HTTPS

Here goes nothing!

I'm switching the canonical (desktop) to be https. Let's see what happens...

I will not be astonished if there is some turmoil, maybe several weeks' worth! Let's hope it's all sorted by the winter solstice!

2020-09-22: I have updated my MachMetrics account to poll the https version of the desktop/canonical/www home page.

2020-09-20: Forever Expiry Time 31536000s

Noting the special support for an expiry time of 31536000 seconds (365 days) to mean 'forever' in QPACK: Header Compression for HTTP/3 (H3), I am making that the 'forever' time for EOU (www/static) too.

I've also removed the Firefox-only 'immutable' from the Cache-Control header. (In fact Safari also supports 'immutable', I see.) Most browsers probably won't use it, it takes some space, and it prevents use of the built-in H3 static table entry for header compression.

The magic config line (to exactly match the H3 static table entry) is now:

Header set Cache-Control "public, max-age=31536000"
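The match has to be exact for the static table entry to apply, so any extra directive (such as 'immutable') or a different ordering loses it. The Python sketch below checks a candidate header value against the Cache-Control values that, as far as I recall, have full name-and-value entries in the QPACK static table (Appendix A of the QPACK specification); the set and the function name are illustrative assumptions, not part of any server config.

```python
# Cache-Control values believed to have complete (name + value) entries in
# the QPACK static table. This list is an assumption reproduced from memory
# of the QPACK specification's Appendix A; check it against the spec.
QPACK_STATIC_CACHE_CONTROL = {
    "max-age=0",
    "max-age=2592000",
    "max-age=604800",
    "no-cache",
    "no-store",
    "public, max-age=31536000",
}

# 365 days expressed in seconds: the 'forever' value used above.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def is_static_entry(value: str) -> bool:
    """True only for a character-for-character match; nothing is normalised."""
    return value in QPACK_STATIC_CACHE_CONTROL


if __name__ == "__main__":
    for candidate in (
        "public, max-age=31536000",
        "public, max-age=31536000, immutable",
        "max-age=31536000",
    ):
        print(repr(candidate), is_static_entry(candidate))
```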

2020-09-18: Anchor AutoAds off

A 90-day AdSense test finished today, which indicated that having a heavier-than-average ad load did not generate more revenue than a below-average ad load. So I'm back on the less-pushy below-average setting.

I also took the opportunity to turn off the 'anchor' ads that stick to the top of (desktop) pages when viewed on mobile. I find them distracting and a significant waste of screen real-estate.

It's difficult to tell for sure, but I think that traffic from Google is coming off its peak of the last year or so, at least looking at the GSC performance graph. Impressions are good, but actual clickthroughs less convincing...

2020-09-17: H2 Strangeness

I tried out HTTP/2 Test: Verify HTTP/2 support. For I get HTTP/2 protocol is supported and ALPN extension is supported. (Both are also supported over plain http, but no browser will make use of that in practice! For some reason although http AMP shows the same, http m says that both are unsupported, on https also: odd.)

It turns out there was some stray Let's Encrypt 'auto' config hanging around:

# ls -al /etc/apache2/sites-enabled/ -> ../sites-available/ -> /etc/apache2/sites-available/

Removing the link and restarting Apache makes all well with the world, or at least well with the HTTP/2 test:

# rm /etc/apache2/sites-enabled/
# /etc/init.d/apache2 restart

A quick survey of the fraction of HTTPS connections using HTTP/2 suggests that it's about one third. Note that Googlebot, for example, doesn't yet use HTTP/2.

% egrep ':443 .* HTTP/1' /var/log/apache2/other_vhosts_access.log | wc -l
% egrep ':443 .* HTTP/2' /var/log/apache2/other_vhosts_access.log | wc -l
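The same tally can be done in one pass and turned into a fraction directly. This is a minimal Python sketch using the same pattern as the egrep commands above; the sample log lines are made up for illustration (hypothetical host name and documentation-range IP addresses) and only assume that the vhost:port field and the request protocol appear as they do in those patterns.

```python
import re
from collections import Counter

# Same idea as the egrep patterns: a ':443 ' vhost/port field, then later in
# the line the protocol token from the request.
PATTERN = re.compile(r":443 .* HTTP/([12])")


def count_protocols(lines):
    """Count HTTPS (port 443) log lines by major HTTP version, '1' or '2'."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def h2_fraction(counts):
    """Fraction of counted HTTPS requests that used HTTP/2 (0.0 if none)."""
    total = counts["1"] + counts["2"]
    return counts["2"] / total if total else 0.0


# Made-up sample lines, not real log data.
SAMPLE = [
    'www.example.org:443 192.0.2.1 - - [17/Sep/2020:10:00:00 +0000] "GET / HTTP/1.1" 200 1234',
    'www.example.org:443 192.0.2.2 - - [17/Sep/2020:10:00:01 +0000] "GET / HTTP/2.0" 200 1234',
    'www.example.org:443 192.0.2.3 - - [17/Sep/2020:10:00:02 +0000] "GET /a HTTP/1.1" 200 99',
    'www.example.org:80 192.0.2.4 - - [17/Sep/2020:10:00:03 +0000] "GET / HTTP/1.1" 200 1234',
]

if __name__ == "__main__":
    counts = count_protocols(SAMPLE)
    print(dict(counts), h2_fraction(counts))
```

To run it against a real log, pass an open file object to count_protocols() in place of SAMPLE.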

2020-09-04: AMP HTTPS Now Preferred

As of today the preferred scheme for the AMP pages is https, though http will still be served and should be fully functional.

This means in practice:

  • Making the amphtml link in the header and AMP navigation link point to the https version.
  • Giving all the header inter-version and canonical and navigation links explicit http / https schemes.
  • Making redirects between versions appropriate in terms of schemes, retaining https as appropriate to avoid security snafus (principle of least surprise).
  • Using the correct scheme in robots.txt and sitemap.xml as appropriate.

It will take a little while to get all the wrinkles out!

In future, the m/lite preferred version is likely to remain http for speed, and the www/desktop preferred version to become https for a small SEO boost.

2020-09-01: Brotli Sides

I have enabled Brotli static pre-compression for supporting top-level pages, such as the home and sitemap pages. If that doesn't cause any problems then I shall extend such br content-encoding to main pages also, either side of making the https set canonical for (say) AMP and desktop. (In practice Brotli is only used over https, since browsers do not advertise br support over plain http.)

: I weakened and have turned on Brotli precompression for all main pages, having seen at least some (legit) spiders starting to fetch content over https, where it may help. (Not all spiders can Accept-Encoding: br even with https though.)
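For readers unfamiliar with static pre-compression, the selection logic amounts to: if the client lists br in Accept-Encoding and a pre-built Brotli file exists, serve that; otherwise fall back to gzip or the plain file. The Python sketch below illustrates that decision only. It is not the site's Apache configuration, and the '.br' / '.gz' sibling-file naming and both function names are assumptions made for the example.

```python
def accepted_codings(accept_encoding: str) -> set:
    """Codings named in an Accept-Encoding value, skipping any with q=0."""
    accepted = set()
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        if not coding:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(val)
                except ValueError:
                    q = 0.0  # Treat a malformed weight as "not acceptable".
        if q > 0:
            accepted.add(coding)
    return accepted


def pick_variant(path: str, accept_encoding: str, available: set) -> str:
    """Prefer a pre-compressed .br sibling, then .gz, else the plain file."""
    codings = accepted_codings(accept_encoding)
    for coding, suffix in (("br", ".br"), ("gzip", ".gz")):
        if coding in codings and path + suffix in available:
            return path + suffix
    return path


if __name__ == "__main__":
    # Hypothetical set of files on disk.
    files = {"index.html", "index.html.br", "index.html.gz"}
    print(pick_variant("index.html", "gzip, deflate, br", files))
    print(pick_variant("index.html", "gzip, deflate", files))
```

A spider that sends no br in Accept-Encoding simply gets the gzip or uncompressed variant, which is why enabling the pre-compressed files should be harmless for clients that cannot use them.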