Why do URL-based ad blockers work?

Disclaimer: I work for Google but not on any of the ads teams. This is a personal post.

When Pete Snyder filed WICG/webpackage#551 that Web Bundles might break ad blockers, I had to figure out what makes those ad blockers work in the first place.

Now, obviously, if a page loads an ad from a URL, then blocking that URL will block the ad. But the web is an evolving system, and ad blockers are in an adversarial relationship with publishers, advertisers, and ad-tech companies who all want to make sure users see their ads. Those ad folks are smart and capable of finding ways around naïve attempts to block their ads. So what prevents them from avoiding that list of URLs? Why do URL-based ad blockers keep working?

This post primarily tries to answer that question, not to answer Pete’s concern, but it does eventually come back to web bundles’ effect on ad blocking: they don’t really affect any of the reasons sites haven’t pursued an arms race with ad blockers.

Sites that don’t try to evade §

A user who has installed an ad blocker has sent a pretty clear signal that they don’t want to see ads. Advertisers and publishers may not want to risk angering such users by showing them ads anyway.

And even though around a quarter of all web users use ad blockers, that share may still be too small to make an arms race with ad blockers worth a publisher’s cost.

Functionality that needs an online endpoint §

The far bigger reason that ad blockers keep working is that advertisements are usually fetched based on the results of an auction that runs as the surrounding page is downloaded. Whoever runs that auction accepts requests at some URL and responds with ads. That URL is a nice stable target for ad blockers. The auctioneer could dodge the blocker by changing the URL whenever it gets blocked, but then they have to find a way to update all of the publishers’ pages that were written to call that URL. That’s a big logistical problem.
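To make the "stable target" concrete, here is a minimal sketch of how a URL-based blocker might check a request against its list. The rule shape, the `isBlocked` function, and the `/auction` path are assumptions for illustration; real filter lists use a much richer rule syntax.

```typescript
// Hypothetical, simplified filter rule: real ad blockers support wildcards,
// resource types, exceptions, and more.
type FilterRule = { host: string; pathPrefix: string };

const rules: FilterRule[] = [
  { host: "auctioneer.example", pathPrefix: "/auction" },
];

// Returns true when the request URL matches any rule, either on the listed
// host itself or on one of its subdomains.
function isBlocked(requestUrl: string, list: FilterRule[]): boolean {
  const url = new URL(requestUrl);
  return list.some(
    (rule) =>
      (url.hostname === rule.host || url.hostname.endsWith("." + rule.host)) &&
      url.pathname.startsWith(rule.pathPrefix)
  );
}

console.log(isBlocked("https://auctioneer.example/auction?slot=top", rules)); // true
console.log(isBlocked("https://publisher.example/article.html", rules)); // false
```

As long as every publisher's page calls the same host and path, one rule covers all of them.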

The first thing the auctioneer might try is to have the publishers load a <script src="https://auctioneer.example/auctioneer.js"> that includes the dynamically updating auction endpoint. This is often known as an “ad tag” and is usually the way ads are served even when they’re not trying to avoid ad blockers. But, oops, now the ad blockers are blocking auctioneer.js, and the auctioneer is back to the original problem.

Obfuscate the URL §

It’s straightforward to obfuscate the URL for the auction endpoint, for example by encrypting it with the current date and even a key provided to the particular publisher. The auctioneer can decrypt the request on their server and run the resulting auction. If the auctioneer isn’t careful, this will lead to their entire domain being blocked, but they might be lucky enough to run a popular website on the same domain, which ad-blocker users would be sad to lose access to. In that case they’ll need to encrypt every resource on the server in the same way, so that URL-based blockers can’t tell ad requests apart from the rest of the site’s traffic.
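The shape of that scheme can be sketched as a round trip between the publisher's page and the auctioneer's server. This is a toy: an XOR keystream stands in for real encryption and offers no actual secrecy. The function names, the key, the date format, and the `/auction` path are all assumptions.

```typescript
// Toy sketch only: an XOR keystream stands in for real encryption.
// The publisher key, date format, and path are hypothetical.

// Derives `length` pseudo-random bytes from a secret string.
function keystream(secret: string, length: number): number[] {
  let state = 0;
  for (let i = 0; i < secret.length; i++) {
    state = (state * 31 + secret.charCodeAt(i)) >>> 0;
  }
  const bytes: number[] = [];
  for (let i = 0; i < length; i++) {
    // Linear congruential step, keeping the top byte of the 32-bit state.
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    bytes.push(state >>> 24);
  }
  return bytes;
}

// Publisher side: turn an ASCII path into a hex token that depends on the
// publisher's key and the date.
function obfuscatePath(path: string, publisherKey: string, date: string): string {
  const ks = keystream(publisherKey + "|" + date, path.length);
  let hex = "";
  for (let i = 0; i < path.length; i++) {
    const byte = (path.charCodeAt(i) ^ ks[i]) & 0xff;
    hex += ("0" + byte.toString(16)).slice(-2);
  }
  return hex;
}

// Auctioneer side: recover the original path from the hex token.
function recoverPath(token: string, publisherKey: string, date: string): string {
  const length = token.length / 2;
  const ks = keystream(publisherKey + "|" + date, length);
  let path = "";
  for (let i = 0; i < length; i++) {
    const byte = parseInt(token.slice(i * 2, i * 2 + 2), 16);
    path += String.fromCharCode(byte ^ ks[i]);
  }
  return path;
}

const exampleToken = obfuscatePath("/auction", "publisher-key", "2020-01-01");
console.log(recoverPath(exampleToken, "publisher-key", "2020-01-01")); // "/auction"
```

Note that both functions have to ship to, and stay in sync across, every publisher's page and the auctioneer's server.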

The bigger problem is that now they have some complicated code copied to every publisher’s page. If that code ever needs to be updated, it’s going to be a problem. And they can’t abstract it into an auctioneer.js for the same reason as before.

Proxy via the first-party server §

The auctioneer could also ask the host of each page to act as a proxy for either the auction request URL or the auctioneer.js posited above. The page would request /any_url_the_publisher_wants.js, and the server would forward that request to the auctioneer and reply with their response. Because of the number of different publishers, it would be difficult for an ad blocker to block all of the script names they picked, and a publisher that wanted to avoid ad blockers could be as creative as they like in rotating those names.
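A minimal sketch of the routing decision on the publisher's server, assuming a hypothetical `upstreamFor` helper and a set of locally chosen paths. The actual forwarding of the request and response is left out.

```typescript
// Hypothetical names throughout: the local path and the upstream URL are
// illustrative, not any real ad network's configuration.
const UPSTREAM = "https://auctioneer.example/auctioneer.js";

// Paths under which the publisher currently serves the proxied script.
const proxiedPaths = new Set<string>(["/any_url_the_publisher_wants.js"]);

// Returns the upstream URL to forward to for proxied paths, or null when
// the server should answer with its own content.
function upstreamFor(path: string): string | null {
  return proxiedPaths.has(path) ? UPSTREAM : null;
}

// Rotating the name: retire the old path and start serving a new one.
function rotate(oldPath: string, newPath: string): void {
  proxiedPaths.delete(oldPath);
  proxiedPaths.add(newPath);
}

console.log(upstreamFor("/any_url_the_publisher_wants.js")); // the upstream URL
console.log(upstreamFor("/index.html")); // null
```

The rotation is cheap for one publisher, but the publisher also has to rewrite the script reference in every page that is served, which is part of the adoption cost.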

However, this is still more difficult for publishers to adopt than pasting an ad tag on their site, and that difficulty seems to have been enough to stop this technique from being widely adopted. Proxying too much would also make it hard for the auctioneer or advertiser to detect ad fraud, since ad fraud detection currently depends on inspecting connections directly to end-users.

Run a CDN §

The auctioneer could also offer to act as a CDN for publishers that want to avoid ad blockers. By proxying all of the publisher’s content, they can automatically rewrite the ad tags into randomized local references that an ad blocker can’t distinguish from the page’s actual subresources. However, the publisher can only do this with one auctioneer, and they need to trust that auctioneer to do a good job serving all the rest of their content.

What about first-party ads? §

A publisher that sells their own ads might not need to make a separate request for ad blockers to target. Instead, they have a choice between an easy-to-manage URL space with all the ad-related resources in a separate path that ad blockers can target, vs ads mixed indistinguishably among the site’s other resources. The second costs enough development and maintenance time that sites tend not to do it. However, some large sites have chosen to frequently rotate the paths of their ads resources to make it hard for URL-based blockers to keep up.

A first party could also inline ad-related resources into the page itself. Any necessary scripts and styles can be placed at the bottom of the page, and images can either be compiled into the scripts or included with data: URLs. This requires every page of the site to be served dynamically and loses any possible caching benefits from sharing ad resources between pages.

What about non-ad uses of ad blockers? §

It turns out that ad blockers are also used to block other intrusive things, like trackers (including social widgets), big downloads like fonts, fingerprinting scripts, and cryptocurrency miners. Trackers and cryptocurrency miners have to make a network request off the first-party origin in order to send their results, and the URL of that request has to be about as stable as an ad auction’s endpoint, so ad blockers can block it.

Fingerprinting scripts, on the other hand, only need to report their result to the surrounding page, and some of them provide npm packages for trivial use in website bundlers (like webpack, Rollup, or Parcel). The fingerprinting script can even be bundled with some of the site’s shared code to ensure that it can be cached within the site while ensuring that blocking it will break the site. Ad blockers will only manage to block a fingerprinting script whose host isn’t trying to avoid the blocker.

Big files are easy to re-host locally, but usually aren’t worth the trouble.

How do web bundles affect this? §

Issue #551 claims that Web Bundles make it easier to avoid ad blockers, so how might they do that?

Uses that need an online endpoint will still need one whether or not they’re bundling their code. Ad blockers should continue to target that endpoint. The considerations that make it difficult to move that endpoint around outside a bundle also make it difficult to move it around using bundles.

Uses that only need to get a script to run can already hide it from blockers using existing JavaScript compilers: if a publisher doesn’t care enough about defeating ad blockers to run a compiler, there’s no reason to think they’ll care enough to build a web bundle either.

Bundles provide another way to inline first-party ads, with the improvement of not needing to use data: URLs for images. They come with the same downsides around needing to serve every page dynamically and losing the caching benefits of sharing ad-related resources between pages.

Acknowledgements §

Thanks to Jeff Kaufman, Justin Fagnani, and Michael Kleber for reviewing this post.

This was originally published on Medium.

The Web Bluetooth Security Model

Web Bluetooth is a developing JavaScript API to allow websites to communicate with Bluetooth devices. Sites ask the browser to show a list of nearby Bluetooth devices matching certain criteria, and the user either picks which to grant access to or cancels the dialog.

Image of the Chromium Bluetooth chooser, saying "https://googlechrome.github.io wants to pair with:" followed by a list of two nearby Bluetooth devices, a "Polar H7" or an "HR Monitor GO9". At the bottom of the dialog are links to follow if the expected device doesn’t appear and a "Pair" button.
The user can choose which heart rate monitor to grant access to, if any.

As you might expect, there are security risks here. When deciding whether to ship the new API, we should look at several kinds of attackers and defenders:

  • An abusive software developer, trying to do embarrassing or privacy-insensitive things that don’t go outside devices’ security models.
  • A malicious software developer, trying to exploit users using nearby Bluetooth devices.
  • A malicious hardware manufacturer, trying to exploit users or websites who connect to their devices.
  • A malicious manufacturer/developer, who can push cooperating hardware and software.
  • Weakly-written device firmware, which doesn’t intend to hurt its users, but might be vulnerable to malicious connections.
  • Weakly-written kernels, which might be vulnerable to either malicious userland software or malicious connections.

The ultimate decision about whether to ship Web Bluetooth should also take the competitiveness of the web into account, but this article only analyzes the security tradeoffs.

Abusive software developers §

Abusive websites might try to do embarrassing things like configure a Bluetooth speaker to play porn sounds. Web Bluetooth defends against this in several ways:

  • The chooser grants a website access to only the specific devices a user selects, which helps the user associate misbehavior with specific sites and prevents those sites from messing with extra devices.
  • On desktop platforms we show a tab indicator while a site is connected to a device, which also helps associate the site with the misbehaving device. This isn’t perfect, since the site might configure a device to misbehave only later, long after the site has disconnected and the tab indicator has disappeared.
  • If users notice misbehavior and revoke a site’s access to a device, we’re looking into ways to aggregate that in a privacy-preserving way and use it to protect other users from that site, either by automatically denying the chooser or by adding an extra warning that the site might be abusive.

Malicious software developers §

In a world with Web Bluetooth, malicious developers will be able to choose between attacking users via native or web apps. We want shipping Web Bluetooth to make their job harder across the combination of both targets.

Getting permission §

Assume the user visits the malicious developer’s website. To grant it permission to attack Bluetooth devices, the user must:

Android M+:

  1. Click on app install banner.
  2. Click ‘Install’ in Play Store. Wait.
  3. Click ‘Open’ in Play Store.
  4. Click ‘Accept’ on a location permission prompt.

iOS:

  1. Click on app install banner.
  2. Click ‘Get’ in App Store.
  3. Click ‘Install’ in App Store. Wait.
  4. Click ‘Open’ in App Store.

Chrome OS (through a Chrome App):

  1. Site calls chrome.webstore.install() inside a user gesture.
  2. Click ‘Add’ on a dialog that mentions Bluetooth. Wait.
  3. Click the app icon.

Web Bluetooth:

  1. Site calls navigator.bluetooth.requestDevice() inside a user gesture.
  2. Click the vulnerable device inside a dialog that mentions pairing.
  3. Click ‘Pair’.

Web Bluetooth provides more warning to users than Android or iOS before giving access to the first device. Web Bluetooth also requires the same permission sequence for each additional device, so the malicious developer can’t attack devices the user wasn’t aware of.
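The per-device permission can be pictured as a table keyed by origin and device. The sketch below is a toy model of that bookkeeping, not the browser's implementation; the `DeviceGrants` class and the device identifiers are assumptions.

```typescript
// Toy model of chooser bookkeeping: each origin gets access only to the
// devices the user picked for it.
class DeviceGrants {
  private grants = new Map<string, Set<string>>();

  // Called only after the user selects `deviceId` in the chooser for `origin`.
  grant(origin: string, deviceId: string): void {
    let devices = this.grants.get(origin);
    if (!devices) {
      devices = new Set<string>();
      this.grants.set(origin, devices);
    }
    devices.add(deviceId);
  }

  // A site may talk to a device only if that exact pair was granted.
  canAccess(origin: string, deviceId: string): boolean {
    const devices = this.grants.get(origin);
    return devices !== undefined && devices.has(deviceId);
  }

  // Called when the user revokes a site's access to a device.
  revoke(origin: string, deviceId: string): void {
    const devices = this.grants.get(origin);
    if (devices) {
      devices.delete(deviceId);
    }
  }
}

const exampleGrants = new DeviceGrants();
exampleGrants.grant("https://googlechrome.github.io", "Polar H7");
console.log(exampleGrants.canAccess("https://googlechrome.github.io", "Polar H7")); // true
console.log(exampleGrants.canAccess("https://googlechrome.github.io", "HR Monitor GO9")); // false
```

Reaching the second heart rate monitor would take another trip through the chooser.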

Getting permission illicitly §

A developer can also hijack a trusted site’s permission to use Bluetooth devices.

  • Native: XcodeGhost demonstrates that it’s possible to compromise native apps at scale, but to do it you need to compromise development machines.
  • Web: Websites are often compromised to host malware. Even without being compromised, websites embed ads that shouldn’t be able to access Bluetooth devices. To make sure ads only get access to expected capabilities, Chris Palmer is proposing a permission delegation API, which Web Bluetooth will use.

Web Bluetooth is probably more vulnerable to this type of attack.

Attacking the kernel through Bluetooth APIs §

The kernel or Bluetooth drivers may be vulnerable to attack from the local machine, or from a remote radio as discussed below. The main defense we have here is to keep the API surface small and to run fuzz tests over that API. Web Bluetooth is helped by the GATT API being relatively small.

Attacking through non-Bluetooth channels §

A user who wants to access a Bluetooth device will follow instructions for how to do so. This may allow other attacks:

  • Native apps find it easier to escape the system sandbox than web apps, at least because web apps have to escape a browser sandbox before even attempting to attack the system.
  • Native apps have more abilities by default than web apps. For example, native apps have raw network access, can execute in the background, and can track users through a persistent advertising ID.
  • Android M+ requires that the user grant access to their location in order for an app to communicate over Bluetooth.

If we ship Web Bluetooth, users can get used to simple uses working on the web, which will help restrict the more dangerous native apps to the cases they’re actually needed.

Avoiding blockage §

Before a site or app is discovered to be malicious:

  • Native: App stores have full access to an app’s code and can test it for malicious behavior on hardware they pick. However, because each kind of remote Bluetooth device may speak a different protocol and have different vulnerabilities, the stores basically can’t test for malice and have to allow any messages they don’t know to be harmful.
  • Web: We can’t do an offline scan of a website, but app stores aren’t benefitting from offline scans in this case anyway. We can block the known-harmful messages using an updatable registry of blacklisted services.
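The registry idea on the web side can be sketched as a filter applied to the services a site asks for. The blocked UUID below is a made-up placeholder rather than a real registry entry, and `allowedServices` is a hypothetical helper. The heart rate service UUID is the standard one.

```typescript
// Placeholder entry for illustration only; not a real blocklisted service.
const blockedServices = new Set<string>([
  "11111111-2222-3333-4444-555555555555",
]);

// Standard 128-bit form of the Bluetooth Heart Rate service (0x180D).
const HEART_RATE_SERVICE = "0000180d-0000-1000-8000-00805f9b34fb";

// Drops any requested service that appears in the blocklist. UUIDs are
// compared in lowercase so that casing differences don't bypass the check.
function allowedServices(requested: string[], blocklist: Set<string>): string[] {
  return requested.filter((uuid) => !blocklist.has(uuid.toLowerCase()));
}

console.log(
  allowedServices(
    [HEART_RATE_SERVICE, "11111111-2222-3333-4444-555555555555"],
    blockedServices
  )
); // only the heart rate service remains
```

Because the set is data rather than code, it can be updated as new harmful services are discovered.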

After a site or app is discovered to be malicious:

  • Native: Stores can take down all apps uploaded under a single credit card.
  • Web: Safe Browsing can block access to the single malicious website.

Web Bluetooth should be just as good at preventing attacks ahead of time, but doesn’t have as strong a response after we discover an attack.

Attacking the device §

  • Native: The app has access to both GATT and Bluetooth Classic profiles. Classic profiles are byte-stream-based, which makes them harder to parse and more likely to be exploitable. As mentioned above, native apps can also attack all devices in radio range, the entire time they’re installed, without going back through a user prompt.
  • Web: Sites can only communicate over the relatively simple GATT protocol, which maps keys to bounded-length values. Sites can also only attack devices the user explicitly granted access to.
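To illustrate why a map of keys to bounded-length values is easier to handle safely than a byte stream, here is a toy store that rejects oversized writes up front. The `ToyGattStore` class and the characteristic name are assumptions. The 512-byte cap is the Bluetooth attribute protocol's maximum attribute value length.

```typescript
// The attribute protocol caps attribute values at 512 bytes.
const MAX_ATTRIBUTE_LENGTH = 512;

// Toy stand-in for a device's characteristics: a key-value map in which
// every value has a known upper bound on its size.
class ToyGattStore {
  private values = new Map<string, Uint8Array>();

  // Rejects values over the bound instead of trying to parse them.
  write(characteristic: string, value: Uint8Array): boolean {
    if (value.length > MAX_ATTRIBUTE_LENGTH) {
      return false;
    }
    this.values.set(characteristic, new Uint8Array(value)); // store a copy
    return true;
  }

  // Returns the stored value, or undefined if nothing was written.
  read(characteristic: string): Uint8Array | undefined {
    return this.values.get(characteristic);
  }
}

const exampleStore = new ToyGattStore();
console.log(exampleStore.write("example-characteristic", new Uint8Array(20))); // true
console.log(exampleStore.write("example-characteristic", new Uint8Array(513))); // false
```

A receiver that knows every message is one bounded value has far less parsing state to get wrong than one that reassembles an open-ended stream.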

Web Bluetooth does not take the extra CORS-like step of asking devices to opt into the origins that are allowed to communicate with them, but is still less likely to give access to exploitable device code.

Some Bluetooth devices intentionally allow firmware updates over a GATT channel. For example, Nordic Semiconductor has defined a Device Firmware Update service with a default implementation in their SDK. Unfortunately this implementation doesn’t check the update’s signature, which could enable an attack along the lines of the iSeeYou attack on USB. As a result, Web Bluetooth will probably add this service to the blacklist, and restrict unsigned updates to native apps. Firmware update services that do check signatures would not need to be blacklisted.

Malicious hardware manufacturers §

Websites that don’t know about Web Bluetooth aren’t affected by its existence, because they have to make an explicit function call to opt into it.

Because users get to choose the device they connect to a website, websites have to design around being given an incorrect device. They may still make incorrect and exploitable assumptions about how the device will respond to their messages. That said, this only affects the single exploited website: browser sandboxing prevents the damage from leaking to other sites.

Malicious hardware may also be able to work alone to attack a user’s computer, as described in the next section.

Malicious hardware manufacturers who also write websites §

Remote devices can also attempt to exploit a user’s computer. The most well-known example of this is innocent-looking USB devices that behave as keyboards or mice when plugged in. I’m told that neither apps nor browsers can pair with Bluetooth devices in a way that makes the devices into trusted keyboards, but I haven’t seen a reliable published source saying this.

Devices may also be able to attack a user’s kernel, possibly through their Bluetooth drivers. We haven’t yet fuzz-tested this attack surface, but we plan to before shipping the API.

The Physical Web makes it easier for malicious hardware to get users onto its website than to get them to install a native app. Web Bluetooth needs to validate that remote hardware can’t attack users’ systems through this route.

Conclusion §

  • Web Bluetooth’s ability to pair an application with a single remote device is a big advance toward the principle of least privilege.
  • Reducing the number of native apps users need to install is another big advance given the general power of native apps.
  • Some users’ devices probably will be exploited by malicious websites using Web Bluetooth. We believe the other security benefits will outweigh this.
  • We need to run several more security tests before shipping the API, including fuzzing several operating systems and testing that they don’t automatically grant access for devices to act as keyboards.

Acknowledgements §

Thanks to Adrienne Porter Felt, Chris Palmer, Xifumi, Giovanni Ortuño, Vincent Scheib, Alex Russell, and François Beaufort for reviewing this. Any remaining mistakes are still mine.

This was originally published on Medium.